Hi folks,

I have an 8-node cluster running on an IBM BladeCenter HS21, using RHEL 5.2 and GFS (not GFS2). The nodes are exhibiting high CPU load from two apps: aisexec and cman_tool. Both race the CPU without any other user apps doing much at all. Effectively, the user experience is dog-slow.

After I reboot one of the nodes, it clears up and these apps (aisexec and cman_tool) seem to behave for a while. Eventually they race the CPU again, days to weeks later. Has anyone ever experienced
this? Top output is below.

Thanks,
Ed

[root@blade1]# top
top - 13:47:51 up 40 days, 22:16, 37 users,  load average: 4.17, 3.94, 3.86
Tasks: 372 total,   2 running, 369 sleeping,   1 stopped,   0 zombie
Cpu(s):  5.9%us, 32.6%sy,  0.0%ni, 61.4%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:   8311372k total,  1934844k used,  6376528k free,    76332k buffers
Swap:  8388600k total,   322976k used,  8065624k free,   443172k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM     TIME+  COMMAND
 4352 root      RT   0 37404  35m 2020 R  100  0.4  10519:34  aisexec
20806 root      16   0  1684  560  484 S   42  0.0   8324:49  cman_tool
12501 root      15   0  1680  556  484 S   31  0.0 609:38.46  cman_tool
27245 root      16   0  1688  560  484 S   30  0.0 508:14.31  cman_tool
 4635 root      34  19     0    0    0 S    2  0.0   1271:52  kipmi0
 5047 root      18   0  405m  17m 6260 S    1  0.2  21:57.04  cimserver
28975 root      15   0  2564 1296  900 R    1  0.0   0:00.05  top
    1 root      15   0  2064  576  524 S    0  0.0   0:02.91  init
    2 root      RT  -5     0    0    0 S    0  0.0   0:02.98  migration/0
    3 root      34  19     0    0    0 S    0  0.0   0:00.11  ksoftirqd/0
    4 root      RT  -5     0    0    0 S    0  0.0   0:00.00  watchdog/0
    5 root      RT  -5     0    0    0 S    0  0.0   0:01.29  migration/1

--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster