Re: Clearing a glock

Scooter Morris <scooter@xxxxxxxxxxxx> · Tue, 27 Jul 2010 05:57:06 -0700

 On 7/27/10 5:15 AM, Steven Whitehouse wrote:
Hi,

If you translate a5b67f into decimal, then that is the inode number of
the inode which is causing a problem. It looks to me as if you have too
many processes trying to access this one inode from multiple nodes.
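
As a small illustration of the conversion described above, here is a minimal Python sketch. The helper name `glock_hex_to_inum` and the mount point `/mnt/gfs2` are hypothetical, chosen only for the example; the hex value is the one from the trace.

```python
def glock_hex_to_inum(hex_id: str) -> int:
    """Convert the hexadecimal number from a glock trace to a decimal inode number."""
    return int(hex_id, 16)


inum = glock_hex_to_inum("a5b67f")
print(inum)  # 10860159

# Build (but do not run) a find command for that inode.
# /mnt/gfs2 is a placeholder mount point, not taken from the original message.
print(f"find /mnt/gfs2 -xdev -inum {inum}")
```

Note that a `find -inum` still has to walk the filesystem, so it may block on a cluster that is already stuck.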

It's not obvious from the traces that anything is actually stuck, but if
you take two traces, a few seconds or minutes apart, then it should
become more obvious whether the cluster is making progress or whether it
really is stuck.
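
One way to compare two such traces is sketched below in Python. It assumes each trace has been saved as plain text (for example by copying the glock dump, commonly found under `/sys/kernel/debug/gfs2/<fsname>/glocks` when debugfs is mounted; that path is an assumption here, not something stated in this thread). The function name `snapshot_delta` and the sample lines are hypothetical placeholders, not real glock output.

```python
def snapshot_delta(before: str, after: str) -> tuple[list[str], list[str]]:
    """Return (lines only in the first trace, lines only in the second trace)."""
    before_lines = set(before.splitlines())
    after_lines = set(after.splitlines())
    return sorted(before_lines - after_lines), sorted(after_lines - before_lines)


# Placeholder text standing in for two saved traces taken some time apart.
first = "lock A waiting\nlock B held\n"
second = "lock A waiting\nlock C held\n"

gone, new = snapshot_delta(first, second)
print("only in first trace: ", gone)
print("only in second trace:", new)

if not gone and not new:
    print("no change between traces; the cluster may really be stuck")
```

If the two traces differ, something is still moving; if they are identical over several minutes, that points towards a genuine hang.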

Steve.

--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster
Hi Steve,
    As always, thanks for the reply.  The cluster was, indeed, truly 
stuck.  I rebooted it last night to clear everything out.  I never did 
figure out which file was the problem.  I did a find -inum, but the find 
hung too.  By that point the load average was up to 80 and climbing.  
Any ideas on how to avoid this?  Are there tunable values I need to 
increase to allow more processes to access any individual inode?

Thanks!

-- scooter
