Hi,
A few days ago, I reported a problem with heavy load on our cluster
login node. I can reproduce the problem; ps axl | grep " D" gives:
0 2001 2368 1 16 0 66088 1572 just_s Ds ? 0:00 -bash
0 0 9854 23584 18 0 63260 800 pipe_w S+ pts/6 0:00 grep D
4 0 11972 1 15 0 66216 1632 just_s Ds ? 0:00 -bash
0 0 21576 1 17 0 73916 824 just_s D ? 0:00 ls --color=tty /scratch/lctmm
0 2001 26178 1 15 0 66092 1600 just_s Ds ? 0:00 -bash
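A tighter filter that matches only on the STAT column would keep the grep process itself out of the listing. The following is a minimal sketch, assuming an awk is available and that STAT is the third column of the input; the helper name dstate is purely illustrative and not an existing tool.

```shell
# Keep the header line plus every process whose STAT field (third column)
# starts with D, i.e. uninterruptible sleep. Reads a ps listing on stdin.
# "dstate" is a hypothetical helper name chosen for this example.
dstate() {
    awk 'NR == 1 || $3 ~ /^D/'
}
```

With procps ps, it could be fed like this: ps -eo pid,ppid,stat,wchan:20,args | dstate. The wchan column shows the kernel function each process is sleeping in, which corresponds to the just_s entries above.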
The problem is tied to attempts to access the directory
/scratch/lctmm, which simply freezes the login bash and leaves it
stuck in uninterruptible sleep (state D). Unfortunately, I can't kill
these hung processes and I have to reboot the login node.
The /scratch volume is GFS2. I can access the other /scratch
subdirectories. Has anyone already experienced such a problem?
--
Nicolas Ferre'
Laboratoire Chimie Provence
Universite' de Provence - France
Tel: +33 491282733
http://sites.univ-provence.fr/lcp-ct
--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster