RE: how to re-use a QP for a new connection

> Steve Wise is helping me with a particular issue where QP re-use might
> be helpful.
> 
> When an RPC/RDMA transport connection is dropped (for example, the NFS
> server crashes), xprtrdma destroys the transport's QP and creates a
> new one for the next connection.

If the remote side crashes, the local QP can transition into the error state, which flushes all posted receives.  I believe that for a WR that has completed in error, only the wr_id field (and the status itself) is valid.
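
As a rough sketch of what that implies for a completion handler: recover the request context from wr_id alone, and do not read anything else when the status is not success. All names below (sketch_wc, sketch_req, sketch_handle_wc) are made up for illustration and are not the kernel's ib_wc API; the field layout is an assumption.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the verbs types; not the real ib_wc layout. */
enum sketch_wc_status { SKETCH_WC_SUCCESS, SKETCH_WC_WR_FLUSH_ERR };

struct sketch_wc {
	uint64_t wr_id;               /* trusted even on error */
	enum sketch_wc_status status; /* trusted even on error */
	uint32_t byte_len;            /* only meaningful on success */
};

/* Per-request context owned by the consumer; wr_id carries a pointer to it. */
struct sketch_req {
	bool done;
	bool flushed;
	uint32_t bytes;
};

/* Recover the request from wr_id alone, then branch on status. */
static void sketch_handle_wc(const struct sketch_wc *wc)
{
	struct sketch_req *req = (struct sketch_req *)(uintptr_t)wc->wr_id;

	req->done = true;
	if (wc->status != SKETCH_WC_SUCCESS) {
		/* Flushed: do not look at byte_len or any other field. */
		req->flushed = true;
		req->bytes = 0;
		return;
	}
	req->flushed = false;
	req->bytes = wc->byte_len;
}
```

Because the handler never inspects an opcode on the error path, the consumer has to record in its own per-request context what kind of WR it posted.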

Note that calling rdma_disconnect() will also transition the QP into the error state.
 
> We're not quite sure what IB_WC_WR_FLUSH_ERR means in that instance. Our
> theory is there is a gap when the old QP is destroyed:
> 
> 1. If the HW reports a successful WR completion but the QP no longer
>    exists, the provider substitutes an IB_WC_WR_FLUSH_ERR completion
> 
> 2. If the WR is dropped before the HW even saw it, the provider inserts
>    an IB_WC_WR_FLUSH_ERR completion
> 
> So if xprtrdma is trying to submit a FAST_REG_MR WR and the completion
> gets flushed, xprtrdma has no way to know whether the rkey was bumped in
> the adapter. Thus it has no certainty which rkey to use to invalidate
> that FRMR.

I'm not familiar with the behavior of fast-register MRs.
 
> I was idly wondering whether re-using the QP during connection loss
> would provide a guarantee that xprtrdma would never see case 1 above.
> Then IB_WC_WR_FLUSH_ERR on a FAST_REG_MR WR would be a more certain
> indication that the HW still has the old rkey.
> 
> I suppose that xprtrdma can "hang onto" the QP without re-using it by
> simply not destroying it until all WRs scheduled on the old QP are
> completed. Is reference counting the QP the usual design pattern to deal
> with this case?

I _thought_ that destroying the QP would clean up any completion entries in the CQ, but I'm not sure of this.  Reference counting should work, though.
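
A minimal sketch of that reference-counting pattern, assuming one reference for the owner plus one per outstanding WR, so the QP is only torn down after the last completion (successful or flushed) has been seen. The sketch_qp type and helpers are hypothetical; a real implementation would wrap a struct ib_qp, use an atomic counter such as a kref, and actually destroy the QP where the flag is set here.

```c
#include <stdbool.h>

/* Hypothetical wrapper; the real QP would sit behind this structure. */
struct sketch_qp {
	int refs;       /* 1 for the owner + 1 per outstanding WR */
	bool destroyed; /* set when the last reference is dropped */
};

static void sketch_qp_init(struct sketch_qp *qp)
{
	qp->refs = 1; /* the owner's reference */
	qp->destroyed = false;
}

static void sketch_qp_get(struct sketch_qp *qp)
{
	qp->refs++;
}

static void sketch_qp_put(struct sketch_qp *qp)
{
	/* Not thread-safe; real code would need an atomic counter. */
	if (--qp->refs == 0)
		qp->destroyed = true; /* real code would destroy the QP here */
}

/* Posting a WR pins the QP until that WR's completion is reaped. */
static void sketch_post_wr(struct sketch_qp *qp)
{
	sketch_qp_get(qp);
}

/* Called for every completion, including flushed ones. */
static void sketch_wr_done(struct sketch_qp *qp)
{
	sketch_qp_put(qp);
}
```

On connection loss the owner would call sketch_qp_put() instead of destroying the QP directly; the teardown then happens from whichever completion drops the final reference.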