> Sure, and all that applies equally to both NFS and gluster, yet in Max's
> example NFS was ~50x faster than gluster for an identical small-file
> workload. So what's gluster doing over and above what NFS is doing that's
> taking so long, given that network and disk factors are equal? I'd buy a
> factor of 2 for replication, but not 50.

When using FUSE, the context switch the syscall undergoes even before
glusterfs gets a hand on it is a _huge_ factor, especially when (wrongly)
comparing with local filesystems.

> In case you missed what I'm on about, it was these stats that Max posted:
>
> > Here is the results per command:
> > dd if=/dev/zero of=M/tmp bs=1M count=16384   69.2 MB/sec (Native) 69.2
> > MB/sec (FUSE) 52 MB/sec (NFS)

This test looks reasonable. Writes seem to be bottlenecked at the sustained
write throughput of the disk itself.

> > dd if=/dev/zero of=M/tmp bs=1K count=163840000   88.1 MB/sec (Native)
> > 1.1 MB/sec (FUSE) 52.4 MB/sec (NFS)

The huge drop in FUSE performance compared to NFS is due to the context
switch overhead (about which glusterfs cannot do much, as that latency is
incurred well before glusterfs even comes into the picture). Since both
glusterfs and NFS cache writes on the client, the comparison really ends up
being the latency of a context switch vs. no context switch (network latency
is hidden almost entirely by the client-side caching) - i.e. merely
delivering the syscall to the filesystem is more expensive with native
glusterfs than with NFS, regardless of what each of them does after the
syscall has been delivered.

> > time tar cf - M | pv > /dev/null   15.8 MB/sec (native) 3.48 MB/sec
> > (FUSE) 254 Kb/sec (NFS)

This test shows why the glusterfs native protocol is better than NFS when
you need to scale out storage. Even with the context switch overhead on the
client side, glusterfs scores better because of the "clustered nature" of
its protocol: NFS has to take a second hop whenever it must fetch data that
is not available on the server it mounted from, whereas for glusterfs it is
always a single hop to whichever server holds the data. In any case,
comparing local disk performance with network disk performance is never
right and is always misleading.

Avati
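
PS. To put rough numbers on the bs=1K run (my arithmetic, not from Max's
post): 1.1 MB/sec at 1K per call is ~1100 write() calls per second, i.e.
roughly 0.9 ms per call, while 88.1 MB/sec native is ~90,000 calls per
second, or ~11 usec per call. Since client-side caching hides the network,
nearly all of that ~0.9 ms gap is the round trip each call makes through
the kernel's FUSE layer into the glusterfs client process and back.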
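
If you want to see the per-call cost directly, one approximate way is to
let strace account for the time spent in write(). The mount points below
are placeholders for whatever your setup uses, and strace's own tracing
overhead inflates the absolute numbers, so compare the two runs against
each other rather than reading either as absolute truth:

  # time spent per write() syscall through the FUSE mount
  strace -c -e trace=write \
      dd if=/dev/zero of=/mnt/glusterfs/tmpfile bs=1K count=100000

  # same workload through the NFS mount, for comparison
  strace -c -e trace=write \
      dd if=/dev/zero of=/mnt/nfs/tmpfile bs=1K count=100000

The usecs/call column in the two summaries should show the gap described
above: per-call time on the FUSE mount dominated by the trip into userspace,
vs. a purely in-kernel path for the NFS mount.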
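
And on the tar test, a toy latency model (invented numbers, just to show
the shape of the argument): on a LAN with a 0.2 ms round trip and a few
round trips per small file (lookup, open, read, close), the native client
pays those round trips once, straight to the brick holding the file. The
NFS client pays them to the one server it mounted, and for every file whose
data actually lives on another brick that server has to repeat them on the
back end - roughly doubling the per-file latency and funnelling everything
through a single box, which is how a tar over many small files can drop
from MB/sec to KB/sec.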