On Thu, 2005-03-17 at 23:39, David Teigland wrote:
>
> Were you using clvm (or specifically, was this node running clvmd)?  If
> not, then the unmount would mean stopping all the dlm threads.  That's
> something we seldom do in our testing because clvmd is always still using
> the dlm.  Starting clvmd on your nodes, even if you don't use it, would
> avoid unmount stopping dlm_astd which may avert the problem.
>
> I just ran across a possibly related problem where kthread_stop() couldn't
> stop dlm_astd.  dlm_astd was in wait_event_interruptible() instead of
> spinning, though.  The fix was to simply get rid of the unnecessary
> wait_queue and the wait_event.  I'm hoping that might fix the problem
> you're seeing, too.  I've attached the patch.

I was not using clvmd.

I grabbed a bunch of info off the node that was hung in umount with
dlm_astd spinning.  The data is here:

http://developer.osdl.org/daniel/GFS/test.14mar2005/

I'll apply your patch and try again.

Daniel