Re: [bug report] blktests srp/002 hang

On Aug 24, 2023 / 20:36, Bob Pearson wrote:
> On 8/24/23 20:11, Shinichiro Kawasaki wrote:
> > On Aug 22, 2023 / 08:20, Bart Van Assche wrote:
> >> On 8/22/23 03:18, Shinichiro Kawasaki wrote:
> >>> CC+: Bart,
> >>>
> >>> On Aug 21, 2023 / 20:46, Bob Pearson wrote:
> >>> [...]
> >>>> Shinichiro,
> >>>
> >>> Hello Bob, thanks for the response.
> >>>
> >>>>
> >>>> I have been aware for a long time that there is a problem with blktests/srp. I see hangs in
> >>>> 002 and 011 fairly often.
> >>>
> >>> I repeated the test case srp/011 and observed that it hangs. This srp/011 hang
> >>> can also be recreated reliably. I reverted the commit 9b4b7c1f9f54 and
> >>> observed that the srp/011 hang disappeared. So, I guess these two hangs have
> >>> the same root cause.
> >>>
> >>>> I have not been able to figure out the root cause but suspect that
> >>>> there is a timing issue in the srp drivers which cannot handle the slowness of the software
> >>>> RoCE implementation. If you can give me any clues about what you are seeing I am happy to help
> >>>> try to figure this out.
> >>>
> >>> Thanks for sharing your thoughts. I myself do not have srp driver knowledge, and
> >>> not sure what clue I should provide. If you have any idea of the action I can
> >>> take, please let me know.
> >>
> >> Hi Shinichiro and Bob,
> >>
> >> When I initially developed the SRP tests these were working reliably in
> >> combination with the rdma_rxe driver. Since 2017 I frequently see issues when
> >> running the SRP tests on top of the rdma_rxe driver, issues that I do not see
> >> if I run the SRP tests on top of the soft-iWARP driver (siw). How about
> >> changing the default for the SRP tests from rdma_rxe to siw and to let the
> >> RDMA community resolve the rdma_rxe issues?
> > 
> > If it takes time to resolve the issues, it sounds a good idea to make siw driver
> > default, since it will make the hangs less painful for blktests users. Another
> > idea to reduce the pain is to improve srp/002 and srp/011 to detect the hangs
> > and report them as failures.
> > 
> > Having said that, some discussion started on this thread for resolution
> > (thanks!) I would wait for a while and see how long it will take for solution,
> > and if the actions on blktests side are valuable or not.
> 
> Did you see Bart's comment about srp not working with older versions of multipathd?
> He is currently not seeing any hangs at all.

Yes, I saw it. My test system is Fedora 38 with device-mapper-multipathd package
version 0.9.4. I compiled and installed the latest multipath-tools but still see
the hangs. I am not sure why the hangs occur on my test system but not on
Bart's.



