Hi, Martin:

Another big thanks to you for your kind reply and suggestions.

Best,
Jas

--- Martin Fuerstenau <martin.fuerstenau@xxxxxxx> wrote:

> Hi,
>
> Unfortunately not. According to my information (which comes mainly from
> this list and from the wiki), this structure (the journal) is established
> on the filesystem for each node of the cluster. If you read the man page
> for gfs_fsck you will see that it must be unmounted from all nodes.
>
> If you have the problem I had, you should plan a maintenance window as
> soon as possible.
>
> My problem started, as mentioned, with a slow GFS from the beginning and
> led to cluster crashes after 7 months. All my problems were fixed by the
> check. Perhaps it is the same with your system.
>
> Yours - Martin
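For reference, a minimal sketch of the offline check described above. The
device path /dev/vg_cluster/lv_gfs and the mount point /mnt/gfs are
placeholders for your own filesystem, and the unmount has to happen on every
node before the check is started on one of them:

    # on every node of the cluster
    umount /mnt/gfs

    # on one node only, once the filesystem is unmounted everywhere
    gfs_fsck -y /dev/vg_cluster/lv_gfs    # -y answers "yes" to all repair questions

    # afterwards remount on each node (assuming an /etc/fstab entry exists)
    mount /mnt/gfs

As Martin notes, this needs a maintenance window: gfs_fsck requires the
filesystem to be unmounted from all nodes while it runs.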
> On Fri, 2008-05-09 at 04:51 -0700, Ja S wrote:
> > Hi Martin:
> >
> > Thanks for your reply indeed.
> >
> > --- Martin Fuerstenau <martin.fuerstenau@xxxxxxx> wrote:
> >
> > > Hi,
> > >
> > > I had (nearly) the same problem. A slow GFS, from the beginning. Two
> > > weeks ago the cluster crashed every time the load became heavier.
> > >
> > > What was the reason? A rotten GFS. GFS uses leaf nodes for data and
> > > leaf nodes for metadata within the filesystem, and the problem was in
> > > the metadata leaf nodes.
> > >
> > > Have you checked the filesystem? Unmount it from all nodes and use
> > > gfs_fsck on the filesystem.
> >
> > No, not yet. I am afraid I cannot unmount the file system and then run
> > gfs_fsck, since server downtime is totally forbidden.
> >
> > Is there any other way to reclaim the unused or lost blocks? (I guess
> > the leaf nodes you mentioned are the disk blocks; correct me if I am
> > wrong.)
> >
> > Should "gfs_tool settune /mnt/points inoded_secs 10" work for a heavily
> > loaded node with frequent create and delete file operations?
> >
> > > In my case it reported (and repaired) tons of unused leaf nodes and
> > > some other errors. The first time I started it without the -y (for
> > > yes). Well, after one hour of typing y I killed it and restarted it
> > > with -y. The work was done within an hour for 1 TB. Now the
> > > filesystem is clean, and it was like a turbocharger and nitrogen
> > > injection for a car. Fast as it was never before.
> >
> > Great. Sounds fantastic. However, if the low performance is caused by
> > the "rotten" GFS, could your now-clean file system get messed up again
> > after a certain period? Do you have a smart way to monitor the status
> > of your file system in order to make a regular downtime schedule and
> > "force" your manager to approve it? :-) If you do, I am eager to know.
> >
> > Thanks again, and I look forward to your next reply.
> >
> > Best,
> > Jas
> >
> > > Maybe there is a bug in the mkfs command or so. I will never use a
> > > GFS without a filesystem check after creation.
> > >
> > > Martin Fuerstenau
> > > Senior System Engineer
> > > Oce Printing Systems, Poing
> > >
> > > On Fri, 2008-05-09 at 02:25 -0700, Ja S wrote:
> > > > Hi, Klaus:
> > > >
> > > > Thank you very much for your kind answer.
> > > >
> > > > Tuning the parameters sounds really interesting. I should give it
> > > > a try.
> > > >
> > > > By the way, how did you come up with these new parameter values?
> > > > Did you calculate them based on some measurements, or simply pick
> > > > them and test?
> > > >
> > > > Best,
> > > > Jas
> > > >
> > > > --- Klaus Steinberger
> > > > <Klaus.Steinberger@xxxxxxxxxxxxxxxxxxxxxx> wrote:
> > > >
> > > > > Hi,
> > > > >
> > > > > > However, it took ages to list the subdirectory on an
> > > > > > absolutely idle cluster node. See below:
> > > > > >
> > > > > > # time ls -la | wc -l
> > > > > > 31767
> > > > > >
> > > > > > real    3m5.249s
> > > > > > user    0m0.628s
> > > > > > sys     0m5.137s
> > > > > >
> > > > > > There are about 3 minutes spent somewhere. Does anyone have
> > > > > > any clue what the system was waiting for?
> > > > >
> > > > > Did you tune glocks? I found that it is very important for the
> > > > > performance of GFS.
> > > > >
> > > > > I'm currently doing the following tunings:
> > > > >
> > > > > gfs_tool settune /export/data/etp quota_account 0
> > > > > gfs_tool settune /export/data/etp glock_purge 50
> > > > > gfs_tool settune /export/data/etp demote_secs 200
> > > > > gfs_tool settune /export/data/etp statfs_fast 1
> > > > >
> > > > > Switch off quota, of course, only if you don't need it. All of
> > > > > these tunings have to be redone every time after mounting, so do
> > > > > them in an init.d script that runs after the GFS mount, and of
> > > > > course do them on every node.
> > > > >
> > > > > Here is the link to the glock paper:
> > > > >
> > > > > http://people.redhat.com/wcheng/Patches/GFS/readme.gfs_glock_trimming.R4
> > > > >
> > > > > The glock tuning (the glock_purge and demote_secs parameters)
> > > > > definitely solved a problem we had here with the Tivoli backup
> > > > > client. Before, it ran for days and sometimes even gave up, and
> > > > > we observed heavy lock traffic. After changing the glock
> > > > > parameters, backup times went down dramatically; we can now run
> > > > > an incremental backup of a 4 TByte filesystem in under 4 hours.
> > > > > So give it a try.
> > > > >
> > > > > There is some more tuning which unfortunately can only be done
> > > > > when the filesystem is created: the default number of resource
> > > > > groups is way too large for today's TByte filesystems.
> > > > >
> > > > > Sincerely,
> > > > > Klaus
> > > > >
> > > > > --
> > > > > Klaus Steinberger           Beschleunigerlaboratorium
> > > > > Phone: (+49 89)289 14287    Am Coulombwall 6, D-85748 Garching, Germany
> > > > > FAX:   (+49 89)289 14280    EMail: Klaus.Steinberger@xxxxxxxxxxxxxxxxxxxxxx
> > > > > URL: http://www.physik.uni-muenchen.de/~Klaus.Steinberger/
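For reference, a rough sketch of the kind of init.d script Klaus describes,
re-applying the tunables after every mount. The mount point and the settune
values are taken from his mail; the script name, the chkconfig priorities and
the "is it mounted?" check are assumptions, so adapt them to your own setup:

#!/bin/sh
#
# gfs-tune    Re-apply GFS tunable parameters after the filesystem is mounted.
#
# chkconfig: 345 99 01
# description: GFS tunables are reset to their defaults on every mount, so
#              this script re-applies them at boot, after the GFS mount.

MOUNTPOINT=/export/data/etp    # placeholder: use your own GFS mount point

case "$1" in
  start)
    # Only touch the tunables if the filesystem is really mounted.
    if mount | grep -q " on ${MOUNTPOINT} "; then
        gfs_tool settune ${MOUNTPOINT} quota_account 0   # only if you do not need quota
        gfs_tool settune ${MOUNTPOINT} glock_purge 50
        gfs_tool settune ${MOUNTPOINT} demote_secs 200
        gfs_tool settune ${MOUNTPOINT} statfs_fast 1
    else
        echo "${MOUNTPOINT} is not mounted, skipping GFS tuning"
    fi
    ;;
  stop)
    # Nothing to undo; the settings vanish at the next unmount anyway.
    ;;
  *)
    echo "Usage: $0 {start|stop}"
    exit 1
    ;;
esac

exit 0

Register it with "chkconfig --add gfs-tune" on every node and make sure it
starts after whatever script mounts your GFS filesystems.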
> > > Martin Fürstenau           Tel.:    (49) 8121-72-4684
> > > Oce Printing Systems       Fax:     (49) 8121-72-4996
> > > OI-12                      E-Mail:  martin.fuerstenau@xxxxxxx
> > > Siemensallee 2
> > > 85586 Poing
> > > Germany

--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster