On Wed, 2007-01-03 at 12:35 +0100, Marco Lusini wrote:
> Hi all,
>
> I have 3 2-node clusters, running just cluster suite, without gfs,
> each one updated with the latest packages released by RHN.
>
> In each cluster one of the two nodes has a steadily growing system CPU
> usage, which seems to be consumed by clurgmgrd and dlm_recvd.
> As an example here is the running time accumulated on one cluster
> since 20 december when it was rebooted:
>
> [root@estestest ~]# ps axo pid,start,time,args
>   PID  STARTED     TIME COMMAND
> ...
> 10221   Dec 20 10:37:05 clurgmgrd
> 11169   Dec 20 06:48:24 [dlm_recvd]
> ...
>
> [root@frascati ~]# ps axo pid,start,time,args
>   PID  STARTED     TIME COMMAND
> ...
>  6226   Dec 20 00:04:17 clurgmgrd
>  8249   Dec 20 00:00:19 [dlm_recvd]
> ...
>
> I attach two graphs made with RRD which show that the system CPU usage
> is steadily growing: note how the trend changed after the reboot on
> 20 december.
> Of course as the system usage increases so does the system load and I
> am afraid of what will happen after 1-2 months of uptime...

System load averages are the average number of processes on the run
queue over the past 1, 5, and 15 minutes. They don't generally trend
upwards over time; if that were the case, I'd be in trouble:

...
28204 15:11:11 01:04:19 /usr/lib/firefox-1.5.0.9/firefox-bin -UILocale en-US
...

However, it is a little odd that you had 10 hours of runtime for
clurgmgrd and over 6 for dlm_recvd. Just taking a wild guess, but it
looks like the locks were all mastered on frascati.

How many services are you running?

Also, take a look at:

https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=212634

The RPMs there might solve the problem with dlm_recvd. Rgmanager in
some situations causes a strange leak of NL locks in the DLM. If
dlm_recvd has to traverse lock lists and that list is ever-growing
(total speculation here), it could explain the amount of consumed
system time.

-- Lon
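
To put rough numbers on the ps output quoted above, the TIME column can be
converted to seconds and divided by the time since the reboot. This is only a
sketch: the 14-day uptime (20 December to 3 January) is an assumption, since
the exact reboot time isn't given, and the helper names are made up for
illustration.

```python
def cpu_time_seconds(t):
    """Convert a ps TIME value, formatted [DD-]HH:MM:SS, to seconds."""
    days = 0
    if "-" in t:
        # Long-running processes get a leading day count, e.g. "1-02:03:04".
        day_part, t = t.split("-", 1)
        days = int(day_part)
    hours, minutes, seconds = (int(x) for x in t.split(":"))
    return ((days * 24 + hours) * 60 + minutes) * 60 + seconds


def average_cpu_share(cpu_time, uptime_days):
    """Fraction of one CPU used on average over the given uptime."""
    return cpu_time_seconds(cpu_time) / (uptime_days * 86400)


# TIME values copied from the quoted ps output.
samples = {
    ("estestest", "clurgmgrd"): "10:37:05",
    ("estestest", "dlm_recvd"): "06:48:24",
    ("frascati", "clurgmgrd"): "00:04:17",
    ("frascati", "dlm_recvd"): "00:00:19",
}

UPTIME_DAYS = 14  # assumption: roughly 20 December to 3 January

for (host, proc), cpu_time in samples.items():
    share = average_cpu_share(cpu_time, UPTIME_DAYS)
    print(f"{host:10s} {proc:10s} {cpu_time}  {share:.2%} of one CPU")
```

An average like this hides any trend. If the lock-list guess is right, the
share would be growing over time, so sampling TIME at intervals and comparing
the differences would show it more clearly than a single total.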
-- Linux-cluster mailing list Linux-cluster@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/linux-cluster