Corfu: Scaling log replication
Corfu: A distributed shared log is a paper running to 25 pages brought to my attention by The Morning Paper. What I really like about Corfu is how it borrows bits and pieces of established techniques and puts them together to get something which is a real breakthrough. Corfu achieves linearizable consistency of a fault tolerant distributed log using very simple write-once techniques. The real eye open is that it implements a distributed log interface without an IO bottleneck giving extreme performance. To quote the papers conclusion:
Despite almost forty years of research into replicated storage schemes, the only approach so far to scale up capacity and throughput has been to shard data and trade consistency for performance. In this article, we presented the CORFU system, which breaks this seeming tradeoff by organizing a cluster of drives as a single, shared log. CORFU offers single-copy semantics at cluster-scale speeds, providing a scalable source of atomicity and durability for distributed systems. CORFU’s novel client-centric design eliminates any single I/O bottleneck between numerous clients and the cluster, allowing data to stream to and from the cluster in parallel.
This series of posts will build up a simple fictional example that demonstrates the key techniques. Then I will glue it all together to describe how Corfu itself works and demystify the role Paxos plays.
The first thing I notice when reading the Corfu paper is that no formal proof of safety is presented. This is a bit of a surprise given that the paper achieves a step change in performance. The reason for this is that Corfu bootstraps itself from a strongly consistent configuration distributed by a conventional Paxos consensus algorithm:
We do not discuss the details of the actual consensus protocol here, as there is abundant literature on the topic. Our current implementation incorporates a Paxos-like consensus protocol using storage units in place of Paxos-acceptors. [page 12]
Using this consistent configuration Corfu builds a distributed log using very simple, and I would say very elegant, mechanisms.
In the following series of posts I will give a quick “from basics” overview of the techniques used in Corfu. I will then explain the Corfu protocol in terms of them.
To name the interesting techniques in the paper:
- Global Log to give Linearizable Consistency covered in Part 2
- Global Sequencer eliminating write contention covered in Part 3
- Stripe And Mirror for “RAID 10”-like resilience and concurrency of IO covered in Part 4
- Copy Collection for garbage collection covered in Part 5
- Write Once data safety, Write Ordering to avoid corruption due to concurrent thread updates, and Collaborating Threads are covered in Part 6
- Paxos Configurator for cluster consistency is covered in Part 7
Those are my names for the techniques. There may be more formal names for what is describe in the following posts. Please post a comment if I have missed a Corfu technique else if a given technique has a more established name.