I think TCP_NODELAY is critical to performance. Actually, after spending
a large number of unfruitful hours on glusterfs, I wrote my own simple
shared storage with a BerkeleyDB backend, and I found that enabling
TCP_NODELAY on my system gives me nearly 10x the readback throughput.

Thanks for pointing this out, I'll definitely try that.

- Wei

Mark Mielke wrote:
> On 09/29/2009 03:39 AM, David Saez Padros wrote:
>>> The
>>> second is 'option transport.socket.nodelay on' in each of your
>>> protocol/client _and_ protocol/server volumes.
>>
>> where is this option documented ?
>
> I'm a little surprised TCP_NODELAY isn't set by default? I set it on
> all servers I write as a matter of principle.
>
> The Nagle algorithm is for very simple servers to have acceptable
> performance. The type of servers that benefit, are the type of servers
> that do writes of individual bytes (no buffering).
>
> Serious servers intended to perform well should be able to easily beat
> the Nagle algorithm. writev(), sendmsg(), or even write(buffer) where
> the buffer is built first, should all beat the Nagle algorithm in
> terms of increased throughput and reduced latency. On Linux, there is
> also TCP_CORK. Unless GlusterFS does small writes, I suggest
> TCP_NODELAY be set by default in future releases.
>
> Just an opinion. :-)
>
> Cheers,
> mark
>
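
For anyone who wants to try this outside of glusterfs, here is a minimal sketch of the two ideas discussed above: setting TCP_NODELAY on a socket, and building the whole message in a buffer before a single send rather than issuing many small writes. It is only an illustration in Python, not how glusterfs or my storage code does it; the helper names (make_nodelay_pair, send_message, recv_exactly) are made up for this example, and the loopback address is just an assumption for a local test.

```python
import socket


def make_nodelay_pair():
    """Return a connected (client, server_side) TCP pair on loopback.

    Hypothetical helper for this example: both ends get TCP_NODELAY,
    which disables the Nagle algorithm on that socket.
    """
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    client = socket.create_connection(listener.getsockname())
    server_side, _ = listener.accept()
    listener.close()
    for sock in (client, server_side):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return client, server_side


def send_message(sock, header, payload):
    """Build the full message first, then hand it to the kernel in one call.

    With Nagle disabled, many tiny writes could each become a tiny packet,
    so the buffering has to happen in the application.
    """
    sock.sendall(header + payload)


def recv_exactly(sock, n):
    """Read exactly n bytes, since recv() may return fewer than requested."""
    chunks = []
    remaining = n
    while remaining:
        chunk = sock.recv(remaining)
        if not chunk:
            raise ConnectionError("peer closed before all bytes arrived")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)
```

A quick check would be to call make_nodelay_pair(), send a header and payload with send_message() on one end, and read them back with recv_exactly() on the other. Whether this helps in practice depends on the write pattern, as Mark says: it pays off when the application already sends complete messages.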