Re: User process NFS write hang followed by automount hang requiring reboot

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, 24 May 2019 at 13:32, Alan Post <adp@xxxxxxxxx> wrote:
>
> On Tue, May 21, 2019 at 03:46:03PM +0000, Trond Myklebust wrote:
> > Have you tried upgrading to 4.19.44? There is a fix that went in not
> > too long ago that deals with a request leak that can cause stack traces
> > like the above that wait forever.
> >
>
> Following up on this.  I have set aside a rack of machines and put
> Linux 4.19.44 on them.  They ran jobs overnight and will do the
> same over the long weekend (Memorial day in the US).  Given the
> error rate (both over time and over submitted jobs) we see across
> the cluster this well be enough time to draw a conclusion as to
> whether 4.19.44 exhibits this hang.
>
> Other than stack traces, what kind of information could I collect
> that would be helpful for debugging or describing more precisely
> what is happening to these hosts?  I'd like to exit from the condition
> of trying different kernels (as you no doubt saw in my initial message
> I've done a lot of it) and enter the condition of debugging or
> reproducing the problem.
>
> I'll report back early next week and appreciate your feedback,
>

Perhaps the output from 'cat /sys/kernel/debug/rpc_clnt/*/tasks'?

Thanks
  Trond



[Index of Archives]     [Linux Filesystem Development]     [Linux USB Development]     [Linux Media Development]     [Video for Linux]     [Linux NILFS]     [Linux Audio Users]     [Yosemite Info]     [Linux SCSI]

  Powered by Linux