Re: [bug report] blktests srp/002 hang

On Wed, Oct 18, 2023 at 01:29:16PM -0500, Bob Pearson wrote:
> On 10/17/23 17:42, Bart Van Assche wrote:
> > On 10/17/23 14:39, Bob Pearson wrote:
> >> On 10/17/23 16:30, Bart Van Assche wrote:
> >>>
> >>> On 10/17/23 14:23, Bob Pearson wrote:
> >>>> Not really, but stuck could mean it died (no threads active) or it is
> >>>> in a loop or waiting to be scheduled. It looks dead. The lower layers are
> >>>> waiting to get kicked into action by some event but it hasn't happened.
> >>>> This is conjecture on my part though.
> >>>
> >>> This call stack means that I/O has been submitted by the block layer and
> >>> that it did not get completed. Which I/O request got stuck can be
> >>> verified by e.g. running the list-pending-block-requests script that I
> >>> posted some time ago. See also
> >>> https://lore.kernel.org/all/55c0fe61-a091-b351-11b4-fa7f668e49d7@xxxxxxx/.
> >>
> >> Thanks. Would this run on the side of a hung blktests or would I need to
> >> set up an srp-srpt file system?
> > 
> > I propose to analyze the source code of the component(s) that you
> > suspect of causing the hang. The output of the list-pending-block-
> > requests script is not sufficient to reveal which of the following
> > drivers is causing the hang: ib_srp, rdma_rxe, ib_srpt, ...
> > 
> > Thanks,
> > 
> > Bart.
> > 
> 
> Bart,
> 
> Another data point. I had seen (months ago) that both the rxe and
> siw drivers could cause blktests srp hangs. More recently when I
> configure my kernel to run lots of tests (lockdep, memory leaks,
> kasan, ubsan, etc.), which definitely slows performance and adds
> delays, the % of srp/002 runs which hang on the rxe driver has gone
> from 10%+- to a solid 100%. This suggested retrying the siw driver
> on the debug kernel since it has the reputation of always running
> successfully. I now find that siw also hangs solidly on srp/002.
> This is another hint that we are seeing a timing issue.

If siw hangs as well, I am definitely comfortable continuing to debug and
leaving the work queues in-tree for now.

Jason


