On Wednesday 17 October 2007 09:52:20 Gordan Bobic wrote:
> On Wed, 17 Oct 2007, Marc Grimme wrote:
> >>>> I have a cluster (3 nodes at the moment, may grow up to 16) for
> >>>> handling a lot of small files (Maildir). When I test the system by
> >>>> sending around 3-5 messages/second I see the load on the cluster nodes
> >>>> go up to about 20-30, even though the CPUs on the cluster are about
> >>>> 90% idle at all times.
> >>>>
> >>>> I am guessing that this is due to the clustered machines waiting for
> >>>> DLM locks to be established, which causes a lot of processes to be
> >>>> fighting to run, but since they don't get to run very soon, they back
> >>>> up and cause the load averages to go up.
> >>>>
> >>>> Assuming the DLM runs over the interface specified by IP and MAC in
> >>>> cluster.conf, it is running over gigabit ethernet.
> >>>>
> >>>> Are there any configuration changes or tuning parameters I can apply
> >>>> to DLM to alleviate this condition? The machine I'm running the test
> >>>> from (the one sending messages) is about 1/4 of the spec of each of
> >>>> the cluster nodes, and it's running a load average of about 0.4. It
> >>>> seems crazy that a single low-spec node should be able to completely
> >>>> overwhelm a cluster 12x its spec several times over.
> >>>
> >>> I don't know a lot about GFS but since no one else has replied yet, my
> >>> understanding is that it's not suitable for an application like what
> >>> you describe (many small files being opened frequently). I think GFS2,
> >>> which is still a tech preview, has been redesigned to improve this
> >>> situation.
> >>
> >> Indeed, I am aware that GFS2 is still broken, but I seem to be getting
> >> no worse a performance out of GFS than I get out of NFS. The only
> >> penalty is the high load, but the throughput is actually similar. The
> >> advantage that makes GFS win is that I don't need an arbitrating server
> >> to handle the NFS exports, which makes the clustering and redundancy a
> >> bit tidier.
> >
> > With your testing, did you also try to adapt the size of the
> > rsbtbl_size/lkbtbl_size? I would be quite interested if this increases
> > your performance or not.
>
> I cannot find these files in /proc (that's where they are implied to be in
> the docs). Can you please point me in the right direction?

Sorry, I knew I forgot something ;-)
http://www.opensharedroot.org/Members/marc/blog/blog-on-dlm/red-hat-dlm-__find_lock_by_id/influence-of-locktable-sizes-rsbtbl_size-lkbtbl_size

> > Do you have a lot of small files?
>
> Yes. The problem doesn't seem to be so bad when files are in different
> directories, but when lots of files are being written to the same
> directory, the load goes up quite badly.

Then this should help. Also enable lock_purging if not already done.

> Gordan

Marc.

> --
> Linux-cluster mailing list
> Linux-cluster@xxxxxxxxxx
> https://www.redhat.com/mailman/listinfo/linux-cluster

--
Gruss / Regards,

Marc Grimme
http://www.atix.de/ http://www.open-sharedroot.org/
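
P.S. Since the question was where the rsbtbl_size/lkbtbl_size files actually live, here is a minimal sketch that simply searches for them by name instead of assuming a fixed path. The two starting directories are assumptions on my part (the location can differ between kernel and cluster suite versions), and the helper name find_tunables is made up for illustration; only the two tunable names come from this thread.

```python
import os

# Tunable names as mentioned in this thread.
TUNABLES = ("rsbtbl_size", "lkbtbl_size")

# Assumed starting points, not confirmed for any particular release;
# adjust them to match your kernel and cluster suite version.
CANDIDATE_ROOTS = ("/proc/cluster", "/sys/kernel/config/dlm")


def find_tunables(roots=CANDIDATE_ROOTS, names=TUNABLES):
    """Return {path: current value or None} for each matching file under roots."""
    found = {}
    for root in roots:
        # os.walk yields nothing for a root that does not exist.
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in filenames:
                if name in names:
                    path = os.path.join(dirpath, name)
                    try:
                        with open(path) as fh:
                            found[path] = fh.read().strip()
                    except OSError:
                        found[path] = None  # present but not readable
    return found


if __name__ == "__main__":
    for path, value in sorted(find_tunables().items()):
        print(f"{path} = {value}")
```

If nothing is printed, the files are not under either assumed directory on that node, and the blog post linked above is the better guide to where they are and which values to try.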