On Tue, 01 Sep 2009 11:33:38 +0530 Shehjar Tikoo <shehjart at gluster.com> wrote:

> Stephan von Krawczynski wrote:
> > On Mon, 31 Aug 2009 19:48:46 +0530 Shehjar Tikoo
> > <shehjart at gluster.com> wrote:
> >
> >> Stephan von Krawczynski wrote:
> >>> Hello all,
> >>>
> >>> after playing around for some weeks we decided to make some
> >>> real-world tests with glusterfs. We took an nfs-client and
> >>> mounted the very same data with glusterfs. The client does some
> >>> logfile processing every 5 minutes and needs around 3.5 minutes
> >>> runtime in an nfs setup. We found that it makes no sense to
> >>> try this setup with gluster replicate as long as we do not have
> >>> the same performance in a single-server setup with glusterfs. So
> >>> now we have one server mounted (halfway replicate) and would
> >>> like to tune performance. Does anyone have experience with a
> >>> simple replacement like that? We found that almost all
> >>> performance options have exactly zero effect. The only thing
> >>> that seems to make at least some difference is read-ahead on the
> >>> server. We end up with around 4.5 - 5.5 minutes runtime of the
> >>> scripts, which is on the edge, as we need something well below 5
> >>> minutes (just like nfs was). Our goal is to maximise performance
> >>> in this setup and then try a real replication setup with two
> >>> servers. The load itself looks like around 100 scripts starting
> >>> at one time and processing their data.
> >>>
> >>> Any ideas?
> >>>
> >> What nfs server are you using? The in-kernel one?
> >
> > Yes.
> >
> >> You could try the unfs3booster server, which is the original unfs3
> >> with our modifications for bug fixes and slight performance
> >> improvements. It should give better performance in certain cases
> >> since it avoids the FUSE bottleneck on the server.
> >>
> >> For more info, do take a look at this page:
> >> http://www.gluster.org/docs/index.php/Unfs3boosterConfiguration
> >>
> >> When using unfs3booster, please use GlusterFS release 2.0.6 since
> >> that has the required changes to make booster work with NFS.
> >
> > I read the docs, but I don't understand the advantage. Why should we
> > use nfs as a kind of transport layer to an underlying glusterfs
> > server when we can easily export the service (i.e. glusterfs)
> > itself? Remember, we don't want nfs on the client any longer, but a
> > replicate setup with two servers (though we do not use it right now,
> > it nevertheless stays our primary goal).
>
> Ok. My answer was simply under the impression that moving to NFS
> was the motive. unfs3booster-over-gluster is a better solution than
> kernel-nfs-over-gluster because it avoids the FUSE layer completely.

Sorry, to make that clear again: I don't want to use NFS unless
ultimately necessary. I would be happy to use a complete glusterfs
environment without any patches and glue to nfs, cifs or the like.

> > It sounds obvious to me that nfs-over-gluster must be slower than
> > pure kernel-nfs. On the other hand, glusterfs per se may even have
> > some advantages on the network side, iff performance tuning (and of
> > course the options themselves) is well designed. The first thing we
> > noticed is that load dropped dramatically both on server and client
> > when not using kernel-nfs. The client dropped from around 20 to
> > around 4, the server from around 10 to around 5. Since all boxes are
> > pretty much dedicated to their respective jobs, a lot of caching is
> > going on anyway.
>
> Thanks, that is useful information.
>
> > So I would not expect nfs to have advantages only because it is
> > kernel-driven. And the current numbers (a loss of around 30% in
> > performance) show that nfs performance is not completely out of
> > reach.
> That is true, we do have setups performing as well and in some
> cases better than kernel NFS despite the replication overhead. It
> is a matter of testing and arriving at a config that works for your
> setup.
>
> > What advantages would you expect from using unfs3booster at all?
>
> To begin with, unfs3booster must be compared against kernel nfsd and
> not against a GlusterFS-only config. When comparing with kernel nfsd,
> one should understand that knfsd involves the FUSE layer, the kernel's
> VFS and network layer, all of which have their advantages and also
> disadvantages, especially FUSE when used with the kernel nfsd. The
> bottlenecks in the FUSE+knfsd interaction are well documented
> elsewhere.
>
> unfs3booster lets you avoid the FUSE layer, the VFS, etc. and talk
> directly to the network and, through that, to the GlusterFS server. In
> our measurements, we found that we could perform better than kernel
> nfs-over-gluster by avoiding FUSE and using our own caching (io-cache),
> buffering (write-behind, read-ahead) and request scheduling
> (io-threads).
>
> > Another thing we really did not understand is the _negative_ effect
> > of adding io-threads on client or server. Our nfs setup needs around
> > 90 nfs kernel threads to run smoothly. Every number greater than 8
> > io-threads measurably reduces the performance of glusterfs.
>
> The main reason why knfsds need a higher number of threads is simply
> that knfsd threads are highly io-bound, that is, they wait for the
> disk IO to complete in order to serve each NFS request.
>
> On the other hand, with io-threads, the right number actually depends
> on the point at which io-threads are used. For example, if you're
> using io-threads just above posix or features/locks, the scenario is
> much like kernel nfsd threads, where each io-thread blocks till the
> disk IO is complete. Is this where you've observed the 8 io-thread
> drop-off?
> If so, then it is something we'll need to investigate.
>
> The other place where you can use io-threads is on the GlusterFS
> client side. It is here that the 8-thread drop-off seems possible,
> since the client side in GlusterFS is more CPU hungry than the server,
> and it is possible that 8 io-threads are able to consume as much CPU
> as is available for GlusterFS. Have you observed what the CPU usage
> figures are as you increase the number of io-threads?
>
> How many CPUs did the machine have when you observed the drop-off
> beyond 8 threads?

The client box has a quad-core Core 2 CPU, the server is a dual AMD
Opteron 246. I am currently trying a pretty simple setup with this
client vol file:

volume remote1
  type protocol/client
  option transport-type tcp
  option remote-host 192.168.82.1
  option remote-subvolume testfs
end-volume

volume remote2
  type protocol/client
  option transport-type tcp
  option remote-host 192.168.82.2
  option remote-subvolume testfs
end-volume

volume replicate
  type cluster/replicate
  option data-self-heal on
  option metadata-self-heal on
  option entry-self-heal on
  subvolumes remote1 remote2
end-volume

volume writebehind
  type performance/write-behind
  #option aggregate-size 1MB  # option is unknown
  #option window-size 1MB
  option cache-size 2MB
  #option block-size 1MB      # option is unknown
  option flush-behind on
  subvolumes replicate
end-volume

I tried read-ahead and others but they don't boost performance at all.
Only writebehind has some positive effect of around 30s runtime (drops
from around 5 mins to 4:30 mins). Unfortunately I need a further
improvement down to around 4 mins runtime, because now, every now and
then, the 5-minute barrier is hit. Remember that the remote2 server is
down in this setup; we use only remote1 currently.

Do you have any ideas how to improve performance?

> -Shehjar

-- 
Regards,
Stephan
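The two io-threads placements Shehjar describes above can be sketched as
volfile fragments. The server volfile below is an assumption (only the
client side appears in the thread); the export directory and volume names
are hypothetical, option names follow GlusterFS 2.0.x conventions, and
thread-count 8 simply marks the drop-off point under discussion. Treat
this as a starting point to test, not a recommendation.

```
# --- Server side (hypothetical): io-threads just above features/locks.
# --- Here each io-thread blocks until disk IO completes, much like a
# --- knfsd thread, so higher thread counts can help.

volume posix
  type storage/posix
  option directory /data/export    # assumed export path
end-volume

volume locks
  type features/locks
  subvolumes posix
end-volume

volume testfs                      # named to match remote-subvolume on the client
  type performance/io-threads
  option thread-count 8            # the drop-off point observed in the thread
  subvolumes locks
end-volume

volume server
  type protocol/server
  option transport-type tcp
  option auth.addr.testfs.allow 192.168.82.*
  subvolumes testfs
end-volume

# --- Client side: io-threads above replicate, the CPU-bound placement.
# --- This slots between the existing replicate and writebehind volumes,
# --- i.e. writebehind's "subvolumes replicate" becomes "subvolumes iothreads".

volume iothreads
  type performance/io-threads
  option thread-count 8
  subvolumes replicate
end-volume

volume writebehind
  type performance/write-behind
  option cache-size 2MB
  option flush-behind on
  subvolumes iothreads
end-volume
```

Raising thread-count on one side at a time while watching per-process CPU
(e.g. with top) would show which placement actually hits the 8-thread
ceiling Shehjar asks about.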