Re: POHMELFS high performance network filesystem. Transactions, failover, performance.

Sage Weil wrote:
What is your opinion of the Paxos algorithm?
It is slow. But it does solve failure cases.

For writes, Paxos is actually more or less optimal (in the non-failure cases, at least). Reads are trickier, but there are ways to keep that fast as well. FWIW, Ceph extends basic Paxos with a leasing mechanism to keep reads fast, consistent, and distributed. It's only used for cluster state, though, not file data.
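As a rough illustration of the leasing idea (a minimal sketch, not Ceph's actual implementation): a replica answers reads locally only while it holds an unexpired lease, and falls back to the leader otherwise. The names `Replica`, `grant_lease`, and `read` are hypothetical, and the clock is passed in explicitly to keep the example deterministic.

```python
class Replica:
    """Hypothetical replica that serves local reads only under a valid lease."""

    def __init__(self):
        self.state = {}            # locally replicated cluster state
        self.lease_expires = None  # time at which the current lease ends

    def grant_lease(self, now, duration):
        # Assumption: the leader promises not to commit a change during the
        # lease without first hearing from this replica.
        self.lease_expires = now + duration

    def read(self, key, now):
        # Without a valid lease the local copy may be stale, so refuse.
        if self.lease_expires is None or now >= self.lease_expires:
            raise RuntimeError("lease expired; read must go through the leader")
        return self.state.get(key)


replica = Replica()
replica.state["epoch"] = 7
replica.grant_lease(now=0.0, duration=5.0)
value = replica.read("epoch", now=1.0)  # served locally, no Paxos round
```

The point of the design is that a read inside the lease window costs no consensus round at all, which is what keeps reads fast and distributed.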

I think the larger issue with Paxos is that I've yet to meet anyone who wants their data replicated 3 ways (this despite newfangled 1TB+ disks not having enough bandwidth to actually _use_ the data they store).

I've seen clusters in the field that planned for this -- they don't want to lose their data.


Similarly, if only 1 out of 3 replicas is surviving, most people want to be able to read their data, while Paxos demands a majority to ensure it is correct.
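The majority requirement comes down to simple arithmetic. A small sketch, with hypothetical helper names `majority` and `has_quorum`:

```python
def majority(replicas: int) -> int:
    """Smallest number of replicas that forms a strict majority."""
    return replicas // 2 + 1


def has_quorum(replicas: int, alive: int) -> bool:
    """True if the surviving replicas can still form a majority."""
    return alive >= majority(replicas)


needed = majority(3)              # 2 of 3 replicas are required
one_survivor = has_quorum(3, 1)   # False: a lone replica cannot form a quorum
two_survivors = has_quorum(3, 2)  # True
```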

This isn't necessarily true -- it's quite easy for most applications to come up with an alternate method for ensuring correctness of retrieved data, if one assumes Paxos consensus was achieved earlier, during the write-data phase. Checksumming is a common solution, but not the only one. As noted, the right solution is domain- or app-specific, of course.
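A minimal sketch of the checksumming approach, assuming the digest was recorded as part of the consensus reached at write time. The function names and the choice of SHA-256 are illustrative assumptions, not anything specified in this thread.

```python
import hashlib


def digest(data: bytes) -> str:
    """Checksum recorded alongside the data at write time (SHA-256 assumed)."""
    return hashlib.sha256(data).hexdigest()


def read_verified(replicas, expected_digest):
    """Return the first replica copy whose checksum matches the agreed digest.

    `replicas` is a list of bytes objects, with None for unreachable replicas.
    """
    for data in replicas:
        if data is not None and digest(data) == expected_digest:
            return data
    raise IOError("no surviving replica matches the recorded checksum")


payload = b"file contents"
agreed = digest(payload)  # assumed to have been agreed on during the write

# Only one of three replicas survives intact; the read still succeeds.
result = read_verified([None, b"corrupted", payload], agreed)
```

This verifies that the copy is intact, but on its own it does not prove the copy is the latest version; that still rests on the assumption that the digest in hand is current.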

Overall, reads can be optimized outside of Paxos in many ways.


(This is why Paxos is typically used only for critical cluster configuration/state, not regular data.)

Yep, I'm working on a config daemon a la Chubby or zookeeper, based on Paxos, that does just this. :)

	Jeff


