Hello,

Is your head node an NFS server, and are the jobs writing to the NFS share?

On Wed, Jun 26, 2013 at 3:27 PM, Doll, Margaret Ann <margaret_doll@xxxxxxxxx> wrote:

> I have a computer cluster running Rocks 5.2, CentOS 6.
>
> The head node is overloaded. There are 2 CPUs on the head node.
>
> top - 14:27:49 up 1 day,  6:11,  6 users,  load average: 13.65, 14.12, 13.92
> Tasks: 168 total,   3 running, 163 sleeping,   0 stopped,   2 zombie
> Cpu(s):  1.2%us,  1.9%sy,  0.0%ni,  0.0%id, 91.7%wa,  1.0%hi,  4.1%si,  0.0%st
> Mem:   2053088k total,  2001464k used,    51624k free,    74476k buffers
> Swap:  1020116k total,      388k used,  1019728k free,  1638076k cached
>
>   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
>  2515 nobody    15   0  218m 3176 1048 S  2.3  0.2   8:46.23 gmetad
>  2967 root      15   0     0    0    0 S  2.0  0.0   0:20.31 nfsd
>  2970 root      15   0     0    0    0 R  1.0  0.0   0:20.60 nfsd
>  3110 nobody    15   0  198m  20m 3360 S  0.3  1.0   4:22.71 gmond
> 29788 mad       15   0 90736 2336 1084 S  0.3  0.1   0:02.91 sshd
>     1 root      15   0 10372  684  572 S  0.0  0.0   0:00.51 init
>     2 root      RT  -5     0    0    0 S  0.0  0.0   0:00.00 migration/0
>     3 root      34  19     0    0    0 S  0.0  0.0   0:00.00 ksoftirqd/0
>     4 root      RT  -5     0    0    0 S  0.0  0.0   0:00.00 watchdog/0
>
> I have everyone logged off of the head node. Four jobs are running on the
> compute nodes, but I believe they are non-parallel jobs, which cause no
> traffic on the head node. The load average on each of the compute nodes is
> less than 8. Each compute node has 8 CPUs.
>
> How can I find the problem? I have seen the zombies go as high as 2 on
> the head node; most of the time there are 0 zombies.
>
> I did reboot the head node, but the problem comes back fairly quickly.
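
On "How can I find the problem?": the top output shows 91.7%wa and 0.0%id, so the CPUs are mostly waiting on I/O rather than computing. On Linux the load average counts tasks in uninterruptible sleep (state D) as well as runnable ones, so a handful of nfsd threads blocked on disk can push the load well above the CPU count. A rough way to check this is sketched below, assuming Python 3 is available on the head node. The function names are made up for illustration; only /proc/stat and /proc/<pid>/stat are real kernel interfaces, and the 5 second sampling interval is an arbitrary choice.

```python
import glob
import time

# Field order of the aggregate "cpu" line in /proc/stat (see proc(5)).
FIELDS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal")


def parse_cpu_line(line):
    """Turn the aggregate 'cpu ...' line of /proc/stat into a dict of tick counts."""
    parts = line.split()
    if parts[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    # zip() stops at the shorter input, so extra fields on newer kernels are ignored.
    return dict(zip(FIELDS, (int(p) for p in parts[1:])))


def iowait_percent(before, after):
    """Share of CPU time spent waiting on I/O between two samples, in percent."""
    deltas = {key: after[key] - before[key] for key in before}
    total = sum(deltas.values())
    return 100.0 * deltas["iowait"] / total if total else 0.0


def parse_state(stat_text):
    """Return (pid, command, state) from the contents of /proc/<pid>/stat."""
    # The command is wrapped in parentheses and may itself contain spaces or ')'.
    left, right = stat_text.index("("), stat_text.rindex(")")
    state = stat_text[right + 1:].split()[0]
    return int(stat_text[:left]), stat_text[left + 1:right], state


def blocked_processes():
    """List (pid, command) for processes currently in uninterruptible sleep (D)."""
    found = []
    for path in glob.glob("/proc/[0-9]*/stat"):
        try:
            with open(path) as handle:
                pid, command, state = parse_state(handle.read())
        except (OSError, ValueError, IndexError):
            continue  # the process exited while we were reading it
        if state == "D":
            found.append((pid, command))
    return found


def read_cpu():
    """Read the aggregate CPU counters from the first line of /proc/stat."""
    with open("/proc/stat") as handle:
        return parse_cpu_line(handle.readline())


if __name__ == "__main__":
    first = read_cpu()
    time.sleep(5)
    print("iowait: %.1f%%" % iowait_percent(first, read_cpu()))
    for pid, command in blocked_processes():
        print("blocked on I/O:", pid, command)
```

If the blocked processes turn out to be mostly nfsd threads, that would point at the compute nodes writing to a share exported from the head node. If they are something else, the disk itself may be the thing to look at.
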
> --
> redhat-list mailing list
> unsubscribe mailto:redhat-list-request@xxxxxxxxxx?subject=unsubscribe
> https://www.redhat.com/mailman/listinfo/redhat-list

--
Jonathan Billings <jsbillin@xxxxxxxxx>
College of Engineering - CAEN - Unix and Linux Support