Re: User process NFS write hang followed by automount hang requiring reboot

On Tue, May 21, 2019 at 03:46:03PM +0000, Trond Myklebust wrote:
> Have you tried upgrading to 4.19.44? There is a fix that went in not
> too long ago that deals with a request leak that can cause stack traces
> like the above that wait forever.
> 

Following up on this.  I have set aside a rack of machines and put
Linux 4.19.44 on them.  They ran jobs overnight and will do the
same over the long weekend (Memorial Day in the US).  Given the
error rate we see across the cluster (both over time and per
submitted job), this will be enough time to draw a conclusion as to
whether 4.19.44 exhibits this hang.

Other than stack traces, what kind of information could I collect
that would be helpful for debugging or describing more precisely
what is happening to these hosts?  I'd like to stop trying different
kernels (as you no doubt saw in my initial message, I've done a lot
of that) and start debugging or reproducing the problem.

I'll report back early next week.  I appreciate your feedback,

-A
-- 
Alan Post | Xen VPS hosting for the technically adept
PO Box 61688 | Sunnyvale, CA 94088-1681 | https://prgmr.com/
email: adp@xxxxxxxxx


