Hi Martin:

Thanks very much for your reply.

--- Martin Fuerstenau <martin.fuerstenau@xxxxxxx> wrote:
> Hi,
>
> I had (nearly) the same problem. A slow gfs. From the beginning. Two
> weeks ago the cluster crashed every time the load became heavier.
>
> What was the reason? A rotten gfs. The gfs uses leafnodes for data and
> leafnodes for metadata within the filesystem. And the problem was in
> the metadata leafnodes.
>
> Have you checked the filesystem? Unmount it from all nodes and use
> gfs_fsck on the filesystem.

No, not yet. I am afraid I cannot unmount the file system and then run
gfs_fsck, since server downtime is totally forbidden. Is there any other
way to reclaim the unused or lost blocks? (I guess the leafnodes you
mentioned are disk blocks; correct me if I am wrong.)

Should "gfs_tool settune /mnt/points inoded_secs 10" work for a heavily
loaded node with frequent file create and delete operations?

> In my case it reported (and repaired) tons of unused leafnodes and
> some other errors. The first time I started it without the -y (for
> yes). Well, after one hour of typing y I killed it and restarted it
> with -y. The work was done within an hour for 1 TB. Now the filesystem
> is clean and it was like a turbocharger and nitrogen injection for a
> car. Faster than it ever was before.

Great. Sounds fantastic. However, if the low performance is caused by a
"rotten" gfs, could your now-clean filesystem get messed up again after
a certain period? Do you have a smart way to monitor the status of your
filesystem, so that you can set up a regular downtime schedule and
"force" your manager to approve it :-) ? If you do, I am eager to know.

Thanks again, and I look forward to your next reply.

Best,

Jas

> Maybe there is a bug in the mkfs command or so. I will never again use
> a gfs without a filesystem check after creation.
>
> Martin Fuerstenau
> Senior System Engineer
> Oce Printing Systems, Poing
>
> On Fri, 2008-05-09 at 02:25 -0700, Ja S wrote:
> > Hi, Klaus:
> >
> > Thank you very much for your kind answer.
> >
> > Tuning the parameters sounds really interesting. I should give it a
> > try.
> >
> > By the way, how did you come up with these new parameter values? Did
> > you calculate them based on some measurements, or simply pick them
> > and test?
> >
> > Best,
> >
> > Jas
> >
> > --- Klaus Steinberger <Klaus.Steinberger@xxxxxxxxxxxxxxxxxxxxxx> wrote:
> > > Hi,
> > >
> > > > However, it took ages to list the subdirectory on an absolutely
> > > > idle cluster node. See below:
> > > >
> > > > # time ls -la | wc -l
> > > > 31767
> > > >
> > > > real    3m5.249s
> > > > user    0m0.628s
> > > > sys     0m5.137s
> > > >
> > > > There are about 3 minutes spent somewhere. Does anyone have any
> > > > clue what the system was waiting for?
> > >
> > > Did you tune glocks? I found that it's very important for the
> > > performance of GFS.
> > >
> > > I'm doing the following tunings currently:
> > >
> > > gfs_tool settune /export/data/etp quota_account 0
> > > gfs_tool settune /export/data/etp glock_purge 50
> > > gfs_tool settune /export/data/etp demote_secs 200
> > > gfs_tool settune /export/data/etp statfs_fast 1
> > >
> > > Switch off quota, of course, only if you don't need it. All these
> > > tunings have to be done every time after mounting, so do them in an
> > > init.d script running after the GFS mount, and of course do it on
> > > every node.
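
For what it's worth, here is a minimal sketch of such a boot-time script,
reusing Klaus's mount point and values from above. The script name, the
chkconfig header, and the mount-point list are placeholders of mine, not
anything from Klaus's setup, so adjust them to your own filesystems (and
drop the quota_account line if you need quotas):

#!/bin/sh
# Hypothetical /etc/init.d/gfs-tune: re-apply GFS tunables after the GFS mount.
# chkconfig: 345 99 01   (placeholder priorities; make it start after the gfs script)

# Placeholder list; put your own GFS mount points here.
GFS_MOUNTS="/export/data/etp"

case "$1" in
  start)
    for mnt in $GFS_MOUNTS; do
      gfs_tool settune "$mnt" quota_account 0   # skip this line if you need quotas
      gfs_tool settune "$mnt" glock_purge 50
      gfs_tool settune "$mnt" demote_secs 200
      gfs_tool settune "$mnt" statfs_fast 1
    done
    ;;
  stop)
    # Nothing to undo; the tunables fall back to their defaults at the next mount.
    ;;
esac
exit 0

Registering it with chkconfig on every node, with a start priority just
after the gfs init script, should be enough; the settings are lost at
unmount anyway, so only the start action matters.
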
> > >
> > > Here is the link to the glock paper:
> > >
> > > http://people.redhat.com/wcheng/Patches/GFS/readme.gfs_glock_trimming.R4
> > >
> > > The glock tuning (the glock_purge and demote_secs parameters)
> > > definitely solved a problem we had here with the Tivoli Backup
> > > Client. Before, it ran for days and sometimes even gave up. We
> > > observed heavy lock traffic.
> > >
> > > After changing the glock parameters, the backup times went down
> > > dramatically; we can now run an incremental backup on a 4 TByte
> > > filesystem in under 4 hours. So give it a try.
> > >
> > > There is some more tuning which, unfortunately, can only be done at
> > > filesystem creation time. The default number of Resource Groups is
> > > way too large for today's TByte filesystems.
> > >
> > > Sincerely,
> > > Klaus
> > >
> > > --
> > > Klaus Steinberger          Beschleunigerlaboratorium
> > > Phone: (+49 89)289 14287   Am Coulombwall 6, D-85748 Garching, Germany
> > > FAX:   (+49 89)289 14280   EMail: Klaus.Steinberger@xxxxxxxxxxxxxxxxxxxxxx
> > > URL: http://www.physik.uni-muenchen.de/~Klaus.Steinberger/
>
> Martin Fuerstenau        Tel.:   (49) 8121-72-4684
> Oce Printing Systems     Fax:    (49) 8121-72-4996
> OI-12                    E-Mail: martin.fuerstenau@xxxxxxx
> Siemensallee 2
> 85586 Poing
> Germany
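
One more thought on Klaus's remark further up that the Resource Groups
can only be influenced when the filesystem is created: as far as I
understand gfs_mkfs, its -r option sets the resource group size in
megabytes, so making each group larger should cut down the number of
groups on a multi-TByte filesystem. The line below is only a sketch with
placeholder names (cluster, filesystem, journal count, device), and the
sizes should be checked against the gfs_mkfs man page for your version
before anyone relies on it:

# DANGER: gfs_mkfs wipes the target device; this is only an illustration
# of the -r option (resource group size in MB). All names are placeholders.
gfs_mkfs -p lock_dlm -t mycluster:mygfs -j 4 -r 2048 /dev/myvg/mylv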