Re: POHMELFS high performance network filesystem. Transactions, failover, performance.

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Sage.

On Wed, May 14, 2008 at 06:35:19AM -0700, Sage Weil (sage@xxxxxxxxxxxx) wrote:
> > > What is your opinion of the Paxos algorithm?
> > 
> > It is slow. But it does solve failure cases.
> 
> For writes, Paxos is actually more or less optimal (in the non-failure 
> cases, at least).  Reads are trickier, but there are ways to keep that 
> fast as well.  FWIW, Ceph extends basic Paxos with a leasing mechanism to 
> keep reads fast, consistent, and distributed.  It's only used for cluster 
> state, though, not file data.

Well, it depends... If we are talking about single node perfromance,
then any protocol, which requries to wait for authorization (or any
approach, which waits for acknowledge just after data was sent) is slow.

If we are talking about agregate parallel perfromance, then its basic
protocol with 2 messages is (probably) optimal, but still I'm not
convinced, that 2 messages case is a good choise, I want one :)

> I think the larger issue with Paxos is that I've yet to meet anyone who 
> wants their data replicated 3 ways (this despite newfangled 1TB+ disks not 
> having enough bandwidth to actualy _use_ the data they store).  
> Similarly, if only 1 out of 3 replicas is surviving, most people want to 
> be able to read their data, while Paxos demands a majority to ensure it is 
> correct.  (This is why Paxos is typically used only for critical cluster 
> configuration/state, not regular data.)

I.e. having more than single node to be failed? Google uses 3-way
replication, but I can not see any factor, which will force people from
lowering failure recovering expectations.

-- 
	Evgeniy Polyakov
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [Samba]     [Device Mapper]     [CEPH Development]
  Powered by Linux