RE: Re: [bug report] blktests srp/002 hang

> -----Original Message-----
> From: Bob Pearson <rpearsonhpe@xxxxxxxxx>
> Sent: Wednesday, 23 August 2023 18:19
> To: Bart Van Assche <bvanassche@xxxxxxx>; Shinichiro Kawasaki
> <shinichiro.kawasaki@xxxxxxx>
> Cc: linux-rdma@xxxxxxxxxxxxxxx; linux-scsi@xxxxxxxxxxxxxxx
> Subject: [EXTERNAL] Re: [bug report] blktests srp/002 hang
> 
> On 8/22/23 10:20, Bart Van Assche wrote:
> > On 8/22/23 03:18, Shinichiro Kawasaki wrote:
> >> CC+: Bart,
> >>
> >> On Aug 21, 2023 / 20:46, Bob Pearson wrote:
> >> [...]
> >>> Shinichiro,
> >>
> >> Hello Bob, thanks for the response.
> >>
> >>>
> >>> I have been aware for a long time that there is a problem with
> >>> blktests/srp. I see hangs in 002 and 011 fairly often.
> >>
> >> I repeated the test case srp/011, and observed it hangs. This hang at
> >> srp/011 also can be recreated in stable manner. I reverted the commit
> >> 9b4b7c1f9f54 then observed the srp/011 hang disappeared. So, I guess
> >> these two hangs have same root cause.
> >>
> >>> I have not been able to figure out the root cause but suspect that
> >>> there is a timing issue in the srp drivers which cannot handle the
> >>> slowness of the software RoCE implementation. If you can give me any
> >>> clues about what you are seeing I am happy to help try to figure
> >>> this out.
> >>
> >> Thanks for sharing your thoughts. I myself do not have srp driver
> >> knowledge, and not sure what clue I should provide. If you have any
> >> idea of the action I can take, please let me know.
> >
> > Hi Shinichiro and Bob,
> >
> > When I initially developed the SRP tests these were working reliably in
> > combination with the rdma_rxe driver. Since 2017 I frequently see issues
> > when running the SRP tests on top of the rdma_rxe driver, issues that I
> > do not see if I run the SRP tests on top of the soft-iWARP driver (siw).
> > How about changing the default for the SRP tests from rdma_rxe to siw
> > and to let the RDMA community resolve the rdma_rxe issues?
> >
> > Thanks,
> >
> > Bart.
> >
> 
> Bart,
> 
> I have also seen the same hangs in siw. Not as frequently but the same
> symptoms.

I had not heard about that one from the siw side, but I will try to make
time to reproduce it and fix siw if needed. I'll let you know if I find
something, Bob.

Bernard.

> About every month or so I take another run at trying to find and fix this
> bug but I have not succeeded yet. I haven't seen anything that looks like
> bad behavior from the rxe side but that doesn't prove anything. I also saw
> these hangs on my system before the WQ patch went in if my memory serves.
> Our main application for this driver at HPE is Lustre which is a little
> different than SRP but uses the same general approach with fast MRs.
> Currently we are finding the driver to be quite stable even under very
> heavy stress.
> 
> I would be happy to collaborate with someone (you?) who knows the SRP side
> well to resolve this hang. I think that is the quickest way to fix this.
> I have no idea what SRP is waiting for.
> 
> Best regards,
> 
> Bob



