Hi Marc,

yesterday the same problem arose again, and I could observe the counters. By the way, I'm using the newest version of RHCS/GFS (GFS-kernel-smp 2.6.9-72.2, GFS 6.1.14-0, rgmanager 1.9.68-1, cman-kernel-smp 2.6.9-50.2, cman 1.0.17-0). One node has 8 GB of RAM, the others 4 GB. The locks didn't change over time.

Thanks!
Sebastian

gfs_tool counters /global/home

                     locks 2041
                locks held 28
              freeze count 0
             incore inodes 20
          metadata buffers 2
           unlinked inodes 0
                 quota IDs 0
        incore log buffers 0
            log space used 0.10%
 meta header cache entries 1
        glock dependencies 1
    glocks on reclaim list 0
                 log wraps 0
      outstanding LM calls 65
     outstanding BIO calls 0
          fh2dentry misses 0
          glocks reclaimed 386
            glock nq calls 214090
            glock dq calls 214002
      glock prefetch calls 148
             lm_lock calls 364
           lm_unlock calls 234
              lm callbacks 593
        address operations 0
         dentry operations 46654
         export operations 0
           file operations 90629
          inode operations 94213
          super operations 173031
             vm operations 0
           block I/O reads 366
          block I/O writes 292

ps axwwww | sort -k4 -n | tail -10

 6771 ?        S      0:00 [gfs_quotad]
 6772 ?        S      0:00 [gfs_inoded]
30527 ?        Ss     0:00 sshd: root@pts/0
30529 pts/0    Ds+    0:00 -bash
17499 ?        Ss     1:15 /usr/sbin/gmond
 3796 ?        Sl     2:32 /usr/sbin/gmetad
 4251 ?        Sl     2:17 /opt/gridengine/bin/lx26-amd64/sge_qmaster
 4270 ?        Sl     5:33 /opt/gridengine/bin/lx26-amd64/sge_schedd
 3606 ?        Ss    14:50 /opt/rocks/bin/python /opt/rocks/bin/greceptor
 1802 ?        R    357:43 df -hP

cat /proc/cluster/services

Service          Name                              GID LID State     Code
Fence Domain:    "default"                           5   2 recover 4 -
[3 2 1 11 5 9 6 10 7 8]

DLM Lock Space:  "clvmd"                             7   3 recover 0 -
[3 2 1 11 5 9 6 10 7 8]

DLM Lock Space:  "Magma"                            17   5 recover 0 -
[3 2 1 11 5 9 6 10 7 8]

DLM Lock Space:  "homeneu"                          19   6 recover 0 -
[3 2 1 11 5 9 6 10 7 8]

GFS Mount Group: "homeneu"                          21   7 recover 0 -
[3 2 1 11 5 9 6 10 7 8]

User:            "usrm::manager"                    16   4 recover 0 -
[3 2 1 11 5 9 6 10 7 8]

Marc Grimme wrote:
> On Tuesday 21 August 2007 09:52:32 Sebastian Walter wrote:
>
>> Hi,
>>
>> Marc Grimme wrote:
>>
>>> Do you also see any messages on the consoles of the nodes? The gfs_tool
>>> counters would also help before that problem occurs, so let them run for
>>> a while beforehand to see if the locks increase.
>>> What kind of stress tests are you doing? I bet searching the whole
>>> filesystem. What makes me wonder is that gfs_tool glock_purge does
>>> not work, whereas it worked for me with exactly the same problems. Did you
>>> set it _AFTER_ the fs was mounted?
>>>
> Sorry, I meant that after is right and before is not ;-(.
> And are you using the latest version of CS/GFS?
> Do you have a lot of memory in your machines, 16G or more?
>
>> That makes me optimistic. I set it after the volume was mounted, so I
>> will give it another try, setting it before mounting. Then I will also
>> mail myself the output of the counters every 10 minutes. Let's see...
>>
> I would be interested in the counters.
> Also add the process list so we can see how much CPU time gfs_scand
> consumes, i.e.
> ps axwwww | sort -k4 -n | tail -10
>
> Have fun, Marc.
>
>> ...with best thanks
>> Sebastian

--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster
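
For reference, a minimal sketch of how the 10-minute counter snapshots could be collected via cron; the mount point /global/home comes from the thread, while the script path, recipient, and mail setup are assumptions to adapt to your environment:

    # /etc/cron.d/gfs-counters (hypothetical file name)
    */10 * * * * root /usr/local/sbin/gfs-counters-snapshot.sh

    # /usr/local/sbin/gfs-counters-snapshot.sh (hypothetical path)
    #!/bin/sh
    # Snapshot the GFS lock counters and the top CPU consumers, then mail them.
    MNT=/global/home          # GFS mount point from the thread
    RCPT=root@localhost       # assumed recipient; change as needed
    {
        date
        gfs_tool counters "$MNT"
        echo
        ps axwwww | sort -k4 -n | tail -10
    } | mail -s "gfs counters $(hostname)" "$RCPT"

Comparing successive snapshots should show whether the lock count and gfs_scand CPU time actually grow over time, which is what Marc asked to see.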