Re: blktest/rxe almost working

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, Sep 03, 2021 at 04:13:22PM -0700, Bart Van Assche wrote:
> On 9/3/21 3:18 PM, Bob Pearson wrote:
> > On 9/2/21 6:38 PM, Jason Gunthorpe wrote:
> > > On Thu, Sep 02, 2021 at 04:41:15PM -0500, Bob Pearson wrote:
> > > > Now that for-next is on 5.14.0-rc6+ blktest srp/002 is very close to
> > > > working for rxe but there is still one error. After adding MW
> > > > support I added a test to local invalidate to check and see if the
> > > > l/rkey matched the key actually contained in the MR/MW when local
> > > > invalidate is called. This is failing for srp/002 with the key
> > > > portion of the rkey off by one. Looking at ib_srp.c I see code that
> > > > does in fact increment the rkey by one and also has code that posts
> > > > a local invalidate. This was never checked before and is now failing
> > > > to match. If I mask off the key portion in the test the whole test
> > > > case passes so the other problems appear to have been fixed. If the
> > > > increment and invalidate are out of sync this could result in the
> > > > error. I suspect this may be a bug in srp. Worst case I can remove
> > > > this test but I would rather not.
> > > 
> > > I didn't check the spec, but since SRP works with HW devices I wonder
> > > if invalidation is supposed to ignore the variant bits in the mkey?
> > 
> > I am a little worried. srp is pretty complex but roughly it looks like it maintains a pool of
> > MRs which it recycles. Each time it reuses the MR it increments the key portion of the rkey. Before
> > that it uses local invalidate WRs to invalidate the MRs presumably to prevent stray accesses
> > to the old version of the MR from e.g. replicated packets. It posts these WRs to a send queue but I
> > don't see where it closes the loop by waiting for a WC so there may be a race between the invalidate
> > and the subsequent map_sg call. The invalidate marks the MR as not usable so this must all happen
> > before the MR is turned on again.
> 
> Hi Bob,
> 
> If there would be any code in the SRP driver that is not compliant with the
> IBTA specification then I can fix it.
> 
> Regarding the invalidate work requests submitted by the ib_srp driver: these
> are submitted before srp_fr_pool_put() is called. A new registration request
> is submitted after srp_fr_pool_get() succeeds. There is one MR pool per RDMA
> channel and there is one QP per RDMA channel. In other words,
> (re)registration requests are submitted to the same QP as unregistration
> requests after local invalidate requests. I think the IBTA requires does not
> allow to reorder a local invalidate followed by a fast registration request.

Right

Jason



[Index of Archives]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Photo]     [Yosemite News]     [Yosemite Photos]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux