Marc Grimme wrote:
Hello,
does anybody know what exactly the task of gfs_scand is? We see it using
a lot of CPU time quite often (e.g. the system has been up for 40h and
gfs_scand has 4h of CPU time).
And can you track down which scand is responsible for which filesystem?
BTW: I'm talking about RHEL4U4.
This is a complicated subject. So please bear with me and see whether
the following description helps:
gfs_scand scans the GFS lock (glock) hash table to find:
1. whether a glock can be downgraded into a less restrictive state (say,
from shared state to unlocked state); dirty data flushing is embedded in
the glock transition code.
2. whether a glock has been idle and in unlocked state for too long, in
which case it will be reclaimed.
Whenever GFS needs a lock, it creates a glock and subsequently asks the lock
manager for a corresponding lock. In the DLM case, there is a one-to-one
correspondence between glocks and DLM locks.
Now if gfs_scand has used too much CPU time, it may mean the system has
accumulated too many locks as described in:
http://people.redhat.com/wcheng/Patches/GFS/readme.gfs_glock_trimming.R4
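To put a rough number on "too much", one option is to compare the
daemon's accumulated CPU time with the elapsed time. The sketch below is
only illustrative: to_secs and cpu_pct are made-up helper names, not part
of GFS, and it assumes both times are in plain HH:MM:SS form (it does not
handle the days prefix that ps prints for long-running processes).

```shell
#!/bin/sh
# Hypothetical helpers, not part of GFS: convert HH:MM:SS to seconds and
# express CPU time as a whole-number percentage of elapsed time.

to_secs() {
    # awk reads the fields as decimal numbers, so "08" or "09" are fine.
    echo "$1" | awk -F: '{ print ($1 * 3600) + ($2 * 60) + $3 }'
}

cpu_pct() {
    # $1 = CPU time, $2 = elapsed time, both HH:MM:SS.
    # The elapsed time must be non-zero.
    cpu=$(to_secs "$1")
    total=$(to_secs "$2")
    echo $(( cpu * 100 / total ))
}

# The numbers from the question: 4h of CPU time over 40h of uptime.
cpu_pct 04:00:00 40:00:00
```

On a live node the two inputs could probably be read with something like
"ps -C gfs_scand -o pid,time,etime", which lists one line per gfs_scand
thread; for the figures in the question the result is 10 percent.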
Unfortunately the lock trimming patch added into RHEL 4.5 is too "mild"
(i.e. not aggressive enough, see Red Hat bugzilla 245776). We'll try to
correct the issue as soon as the next errata is available. In short, if the
daemon hogs too much CPU time whenever it wakes up, without any sign of
slowing down, you can try to make it run less often with:
shell> gfs_tool settune <mount_point> scand_secs <x>
// the default x is 5 seconds
The side effect of a longer scand_secs is that if you have a large amount of
file write and/or delete activity, the dirty data will stay in the
buffer cache for a longer time and the lock count will go up considerably.
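Before changing the value it may help to record the current setting. A
minimal sketch, assuming "gfs_tool gettune" prints one "name = value" line
per tunable; get_tune is a hypothetical helper, and /mnt/gfs and the value
30 are placeholders only:

```shell
#!/bin/sh
# Sketch only: get_tune is a hypothetical helper that pulls one tunable
# out of "name = value" lines read from stdin (the assumed gettune format).

get_tune() {
    awk -v key="$1" '$1 == key && $2 == "=" { print $3 }'
}

# Example with a hand-written sample line rather than real tool output:
printf 'scand_secs = 5\n' | get_tune scand_secs

# On a real node (mount point and new value are placeholders):
#   gfs_tool gettune /mnt/gfs | get_tune scand_secs
#   gfs_tool settune /mnt/gfs scand_secs 30
```

Noting the old value first makes it easy to go back to it if the lock
count or the amount of dirty data grows more than you would like.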
-- Wendy
--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster