On Mon, Feb 13, 2012 at 03:07:53PM +0100, Arnold Krille wrote:
> > If I understand it right (and I'm quite new to gluster myself),
> > whenever you do a read on a replicated volume, gluster dispatches
> > the operation to both nodes, waits for both results to come back,
> > and checks they are the same (and if not, works out which is wrong
> > and kicks off a self-heal operation)
> >
> > http://www.youtube.com/watch?v=AsgtE7Ph2_k
> >
> > And of course, writes have to be dispatched to both nodes, and won't
> > complete until the slowest has finished. This may be the reason for
> > your poor latency.
>
> I understand that writes have to happen on all (running) replicas and
> only return when the last one has finished (like the C-protocol with
> drbd). But reads can (and should) happen from the nearest node only,
> or from the fastest. With two nodes you can't decide which node has
> the 'true' data except by checking the attributes.

The attributes record which copy was written most recently, and (as I
understand it) that information is used to decide which one is correct
(see the P.S. below for one way to look at those attributes).

> NFS on these nodes is limited by the gigabit network performance and
> the disk, and results in min(120MBit, ~100MBit) from network and disk.
> But I will run dbench on the NFS shares (without gluster) this
> evening.

Yes, I think a useful comparison would be:

* NFS
* Gluster with a single-disk volume [or a distributed-only volume]
* Gluster with a replicated volume [or a replicated/distributed volume]

Using the same dbench parameters in each case, of course (rough
commands are sketched in the P.P.S. below).

Regards,

Brian.
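P.S. If you want to look at the attributes the replication translator
keeps for that decision, you can query the extended attributes of a
file directly on each brick (not through the gluster mount). The brick
path below is just a placeholder for your setup, and this is only my
understanding of how to read the values:

    # run as root on each node, against the same file on the brick
    # filesystem itself
    getfattr -d -m . -e hex /export/brick1/path/to/file

    # the output should include entries along the lines of:
    #   trusted.afr.<volname>-client-0=0x000000000000000000000000
    #   trusted.afr.<volname>-client-1=0x000000000000000000000000
    # all-zero values mean (as far as I know) the copies are in sync;
    # non-zero counters mean pending changes for the other replica.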
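P.P.S. In case it's useful, roughly what I had in mind for the gluster
side of the comparison. Hostnames, brick paths and volume names below
are just placeholders, so adjust them to your setup, and run dbench
with the same parameters against the plain NFS mount as the baseline:

    # single-brick volume on one node
    gluster volume create testsingle server1:/export/brick1
    gluster volume start testsingle

    # replica 2 volume across both nodes
    gluster volume create testrepl replica 2 \
        server1:/export/brick2 server2:/export/brick2
    gluster volume start testrepl

    # mount each volume and run dbench with identical parameters,
    # e.g. 4 clients for 60 seconds
    mount -t glusterfs server1:/testsingle /mnt/testsingle
    dbench -D /mnt/testsingle -t 60 4

    mount -t glusterfs server1:/testrepl /mnt/testrepl
    dbench -D /mnt/testrepl -t 60 4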