... in practice, however, theory and practice differ.
Or so goes the old saying.
Modern "web scale" distributed systems are full of some pretty neat theory, for example:
- Distributed hash table (DHT) and consistent hashing algorithms, for example Chord: A Scalable Peer-to-peer Lookup Service for Internet Applications, or Consistent Hashing and Random Trees: Distributed Caching Protocols for Relieving Hot Spots on the World Wide Web.
- Quorum replication storage systems, for example Dynamo: Amazon’s Highly Available Key-value Store, or Cassandra - A Decentralized Structured Storage System.
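The core idea behind both of the DHT papers above is the same: place nodes and keys on a hash ring, and assign each key to the first node encountered clockwise. A minimal sketch of that lookup (the `ConsistentHashRing` class and hash choice here are my own illustration, not code from either paper):

```python
import bisect
import hashlib

def ring_position(s: str) -> int:
    """Map a string to a point on the hash ring."""
    return int(hashlib.md5(s.encode()).hexdigest(), 16)

class ConsistentHashRing:
    """Toy consistent-hash ring: each key belongs to the first node clockwise."""

    def __init__(self, nodes):
        # Sort (position, node) pairs so lookups can binary-search the ring.
        self.ring = sorted((ring_position(n), n) for n in nodes)

    def lookup(self, key: str) -> str:
        points = [p for p, _ in self.ring]
        # Find the first node at or past the key's position, wrapping around.
        i = bisect.bisect_right(points, ring_position(key)) % len(self.ring)
        return self.ring[i][1]

ring = ConsistentHashRing(["node-a", "node-b", "node-c"])
owner = ring.lookup("some-key")
```

The appealing property, which the papers build on, is that adding or removing a node only remaps the keys adjacent to it on the ring, rather than rehashing everything.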
Although the theory is fascinating and well worth reading, an interesting question in both cases is how the algorithms are configured in practice. Both the DHT algorithms and the quorum algorithms have parameters that depend on the total number of nodes in the system and on the rates at which nodes arrive and leave it (the "churn" rate).
Two recent papers explore these configuration choices in more detail:
- A performance vs. cost framework for evaluating DHT design tradeoffs under churn
- Probabilistically Bounded Staleness for Practical Partial Quorums
Both of these papers are strongly tilted toward the "practice" end of the theory/practice continuum, and for that I welcome and appreciate them.
The DHT paper explores the performance of several Distributed Hash Table algorithms in network scenarios that involve dynamic group membership changes:
DHTs incorporate many features to improve lookup performance at extra communication cost in the face of churn. It is misleading to evaluate the performance benefits of an individual design choice alone because other competing choices can be more efficient at using bandwidth. PVC presents designers with a methodology to determine the relative importance of tuning different protocol parameters under different workloads and network conditions. As parameters often control the extent to which a given protocol feature is enabled, PVC allows designers to judge whether a protocol feature is more efficient at using additional bandwidth than others via the analysis of the corresponding protocol parameters.
The Staleness paper explores the behavior of systems that forgo the strong consistency guarantees of overlapping read and write replica sets, and instead use partial quorum configurations:
Employing partial or non-strict quorums lowers operation latency in quorum replication. With partial quorums, sets of replicas written to and read from are not guaranteed to overlap: given N replicas and read and write quorum sizes R and W; partial quorums imply R+W <= N. Modern quorum-based data systems such as Dynamo and its open source descendants Apache Cassandra, Basho Riak, and Project Voldemort offer a choice between these two modes of quorum replication: overlapping quorums, providing strong consistency, and partial quorums, providing eventual consistency.
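The R + W vs. N rule quoted above is easy to turn into code. As a sketch (the helper names are mine, and the probability model is a deliberate simplification, not the paper's full PBS analysis): classify an (N, R, W) configuration, and compute the chance that a read quorum, chosen uniformly at random, misses the write quorum entirely.

```python
from math import comb

def quorum_mode(n: int, r: int, w: int) -> str:
    """Classify an (N, R, W) configuration per the overlap rule."""
    if r + w > n:
        return "strict"   # every read quorum intersects every write quorum
    return "partial"      # overlap not guaranteed: eventual consistency

def p_stale_read(n: int, r: int, w: int) -> float:
    """Probability a read misses the write quorum entirely, assuming both
    quorums are uniform random replica sets and the new value lives only
    on the original W replicas (a much simpler model than the paper's)."""
    if r + w > n:
        return 0.0  # overlap is guaranteed
    # Miss iff all R read replicas come from the N - W unwritten replicas.
    return comb(n - w, r) / comb(n, r)

quorum_mode(3, 2, 2)   # "strict":  2 + 2 > 3
quorum_mode(3, 1, 1)   # "partial": 1 + 1 <= 3
p_stale_read(3, 1, 1)  # 2/3 under this toy model
```

PBS's contribution is a far more realistic version of this calculation: it also models the propagation of writes to the remaining replicas over time, yielding the probability that a read is stale as a function of how long ago the write completed.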
Neither of these is an easy paper; to understand them, you first have to understand the systems and algorithms they study.
However, if you're interested in the implementation choices faced by those building these modern web-scale systems, both papers offer great insight and a host of new tools for studying and understanding system behavior.