> > What is your opinion of the Paxos algorithm?
>
> It is slow.  But it does solve failure cases.

For writes, Paxos is actually more or less optimal (in the non-failure
cases, at least).  Reads are trickier, but there are ways to keep them
fast as well.  FWIW, Ceph extends basic Paxos with a leasing mechanism to
keep reads fast, consistent, and distributed.  It's only used for cluster
state, though, not file data.

I think the larger issue with Paxos is that I've yet to meet anyone who
wants their data replicated 3 ways (this despite newfangled 1TB+ disks
not having enough bandwidth to actually _use_ the data they store).
Similarly, if only 1 out of 3 replicas survives, most people still want
to be able to read their data, while Paxos demands a majority to ensure
it is correct.  (This is why Paxos is typically used only for critical
cluster configuration/state, not regular data.)

sage
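
For illustration only, the majority requirement mentioned above can be
sketched in a few lines of Python.  The function names (majority,
has_quorum) are hypothetical and are not taken from Ceph or from any
particular Paxos implementation; the sketch shows only the arithmetic
behind why 1 surviving replica out of 3 cannot form a quorum.

```python
def majority(n_replicas: int) -> int:
    """Smallest number of replicas that forms a strict majority."""
    return n_replicas // 2 + 1


def has_quorum(n_alive: int, n_replicas: int) -> bool:
    """True if the surviving replicas can still form a majority."""
    return n_alive >= majority(n_replicas)


if __name__ == "__main__":
    # Hypothetical 3-replica group: show which survivor counts keep a quorum.
    for alive in range(4):
        print(f"{alive} of 3 alive -> quorum: {has_quorum(alive, 3)}")
```

With 3 replicas the majority is 2, so a lone survivor has to refuse
service rather than risk returning stale data, which is the trade-off
described above.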