Re: problem with deadlocked processes (D)

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Peter Sopko wrote:
> Hi,
> 
> today a strange thing occurred - on both of our cluster nodes a lot of
> processes suddenly started to become locked in the D state (i/o lock). This
> thing has already happened once before (six months ago), but a simple reboot
> helped to solve this issue. But as it appeared again, I don't want to solve
> it this way again, I would like to find the reason why this is happening,
> but have no idea where to start. In /var/log/messages there is nothing
> unusual, the only thing is that some directories are unremovable and a lot
> of processes are locked.

For problems where processes are getting stuck in D state it's usually
helpful to get sysrq-t data (a dump of every task's kernel stack, written
to the kernel log) to see where the threads are stuck. Grab two sets of
data a few seconds apart so that you can see if things are really stuck
or just making slow progress.
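
As a rough sketch of grabbing the two dumps, something like the following
should work when run as root. The function name capture_sysrq_t and its two
parameters are made up for illustration; the only real interface used is
writing "t" to /proc/sysrq-trigger. The traces end up in the kernel log
(dmesg, and usually /var/log/messages), and a very large dump may overflow
the kernel ring buffer, so it can be worth saving the log after each one.

```shell
#!/bin/sh
# Hypothetical helper: request a sysrq-t task dump twice, a few seconds apart.
#   $1 - trigger file (defaults to /proc/sysrq-trigger)
#   $2 - seconds to wait between the two dumps (defaults to 5)
capture_sysrq_t() {
    trigger=${1:-/proc/sysrq-trigger}
    interval=${2:-5}

    echo t > "$trigger" || return 1   # first dump
    sleep "$interval"
    echo t > "$trigger" || return 1   # second dump, for comparison
}

# Example (as root):
#   capture_sysrq_t && dmesg > /tmp/sysrq-t.txt
```

If a thread shows the same stack in both dumps it is most likely really
stuck; if the stack has changed it is probably just making slow progress.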

You can also get some information from the wchan data exposed in /proc
(the name of the kernel function a sleeping task is waiting in) - it's
easiest to view with ps:

$ ps ax -ocomm,pid,state,wchan
COMMAND           PID S WCHAN
vim             22322 S -
bash            22471 S -
man             22817 S wait
sh              22820 S wait
sh              22821 S wait
less            22826 S -
bash            22839 S wait
screen          23435 S pause
[...]
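
To narrow that listing down to just the stuck tasks, one option is to filter
on the state column. This is only a sketch: the helper name filter_d_state
is invented here, and the columns are reordered compared to the command
above so that the state is the first field (command names can contain
spaces, which would confuse a filter on a later field). The wchan:32 width
specifier assumes the procps version of ps.

```shell
#!/bin/sh
# Hypothetical helper: keep the header line plus every task whose
# state (first column) starts with D (uninterruptible sleep).
filter_d_state() {
    awk 'NR == 1 || $1 ~ /^D/'
}

# State first, command name last; wchan widened so names are not truncated.
ps -eo state,pid,wchan:32,comm | filter_d_state
```

The WCHAN column for the D state tasks then gives a first hint about what
they are waiting on, before digging into the full sysrq-t traces.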

Regards,
Bryn.



-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.7 (GNU/Linux)
Comment: Using GnuPG with Fedora - http://enigmail.mozdev.org

iD8DBQFGE5226YSQoMYUY94RAgm0AKDdPg/mcTHilSwMpd6+Meno2zBLtACgt+/j
TT3MsBrg6/gpdBdPDYMEp5Q=
=ADyt
-----END PGP SIGNATURE-----

--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster
