RE: [PATCH v3 05/11] xprtrdma: Do not wait if ib_post_send() fails

> >>>>>> Moving the QP into error state right after with rdma_disconnect
> >>>>>> you are not sure that none of the subset of the invalidations
> >>>>>> that _were_ posted completed and you get the corresponding MRs
> >>>>>> in a bogus state...
> >>>>>
> >>>>> Moving the QP to error state and then draining the CQs means
> >>>>> that all LOCAL_INV WRs that managed to get posted will get
> >>>>> completed or flushed. That's already handled today.
> >>>>>
> >>>>> It's the WRs that didn't get posted that I'm worried about
> >>>>> in this patch.
> >>>>>
> >>>>> Are there RDMA consumers in the kernel that use that third
> >>>>> argument to recover when LOCAL_INV WRs cannot be posted?
> >>>>
> >>>> None :)
> >>>>
> >>>>>>> I suppose I could reset these MRs instead (that is,
> >>>>>>> pass them to ib_dereg_mr).
> >>>>>>
> >>>>>> Or, just wait for a completion for those that were posted
> >>>>>> and then all the MRs are in a consistent state.
> >>>>>
> >>>>> When a LOCAL_INV completes with IB_WC_SUCCESS, the associated
> >>>>> MR is in a known state (ie, invalid).
> >>>>>
> >>>>> The WRs that flush mean the associated MRs are not in a known
> >>>>> state. Sometimes the MR state is different than the hardware
> >>>>> state, for example. Trying to do anything with one of these
> >>>>> inconsistent MRs results in IB_WC_BIND_MW_ERR until the thing
> >>>>> is deregistered.
> >>>>
> >>>> Correct.
> >>>>
> >>>
> >>> It is legal to invalidate an MR that is not in the valid state.  So you
> >>> don't have to deregister it, you can assume it is valid and post another
> >>> LINV WR.
> >>
> >> I've tried that. Once the MR is inconsistent, even LOCAL_INV
> >> does not work.
> >>
> >
> > > Maybe IB Verbs don't mandate that invalidating an invalid MR must be
> > > allowed? (looking at the verbs spec now).
>

IB Verbs doesn't specify this requirement.  iW verbs does.  So transport
independent applications cannot rely on it.  So ib_dereg_mr() seems to be the
only thing you can do.
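
For illustration, the transport-independent policy that falls out of this could
be sketched as below.  This is a toy model only: the toy_* names are
hypothetical and are not kernel or verbs API.  Treating a never-posted LOCAL_INV
the same as a flushed one is an assumption based on the ib_dereg_mr suggestion
quoted earlier in the thread.

```c
#include <assert.h>

/* Hypothetical outcome of trying to invalidate one MR. */
enum toy_linv_result {
	TOY_LINV_SUCCESS,	/* completed OK: MR is known to be invalid */
	TOY_LINV_FLUSHED,	/* flushed: MR state is unknown */
	TOY_LINV_NOT_POSTED	/* WR never reached the send queue */
};

/* Hypothetical recovery action for that MR. */
enum toy_mr_action {
	TOY_MR_REUSE,		/* safe to fast-register again */
	TOY_MR_DEREG		/* pass to ib_dereg_mr() and allocate a new one */
};

/*
 * Without relying on "LINV of an invalid MR is legal" (which only the
 * iW verbs guarantee), the only outcome that leaves the MR in a known
 * state is a successful completion.
 */
static enum toy_mr_action toy_mr_recovery(enum toy_linv_result r)
{
	switch (r) {
	case TOY_LINV_SUCCESS:
		return TOY_MR_REUSE;
	case TOY_LINV_FLUSHED:
	case TOY_LINV_NOT_POSTED:
		return TOY_MR_DEREG;
	}
	assert(!"unknown LOCAL_INV result");
	return TOY_MR_DEREG;
}
```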
 
> If the MR is truly invalid, then there is no issue, and
> the second LOCAL_INV completes successfully.
> 
> The problem is after a flushed LOCAL_INV, the MR state
> sometimes does not match the hardware state. The MR is
> neither registered nor invalid.
> 

At least with iWARP devices, there is a difference between the MR's state
(VALID vs INVALID) and whether the MR is allocated at all.

> A flushed LOCAL_INV tells you nothing more than that the
> LOCAL_INV didn't complete. The MR state at that point is
> unknown.
> 

With respect to iWARP and cxgb4: when you allocate a fastreg MR, HW has an entry
for that MR and it is marked "allocated".  The MR record in HW also has a state:
VALID or INVALID.  While the MR is "allocated" you can post WRs to invalidate it
which changes the state to INVALID, or fast-register memory which makes it
VALID.  Regardless of what happens on any given QP, the MR remains "allocated"
until you call ib_dereg_mr().  So at least for cxgb4, you could in fact just
post another LINV to get it back to a known state that allows subsequent
fast-reg WRs.
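
A minimal sketch of that model, as a toy state machine rather than cxgb4 driver
code.  All toy_* names are hypothetical.  Two details are assumptions not stated
above: that a freshly allocated MR starts out INVALID, and that a fast-reg WR is
only accepted while the MR is INVALID.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum toy_mr_state { TOY_MR_INVALID, TOY_MR_VALID };

struct toy_mr {
	bool allocated;			/* cleared only by toy_dereg_mr() */
	enum toy_mr_state state;	/* meaningful only while allocated */
};

/* Allocation creates the HW entry; initial state is assumed INVALID. */
static void toy_alloc_fastreg_mr(struct toy_mr *mr)
{
	assert(mr != NULL);
	mr->allocated = true;
	mr->state = TOY_MR_INVALID;
}

/* LINV is accepted in either state as long as the MR is allocated. */
static bool toy_post_linv(struct toy_mr *mr)
{
	assert(mr != NULL);
	if (!mr->allocated)
		return false;
	mr->state = TOY_MR_INVALID;
	return true;
}

/* Fast-reg makes the MR VALID; assumed to require INVALID first. */
static bool toy_post_fastreg(struct toy_mr *mr)
{
	assert(mr != NULL);
	if (!mr->allocated || mr->state != TOY_MR_INVALID)
		return false;
	mr->state = TOY_MR_VALID;
	return true;
}

/* A QP going into error flushes WRs but leaves the allocation alone. */
static void toy_qp_flush(struct toy_mr *mr)
{
	assert(mr != NULL);
	(void)mr;
}

/* Only deregistration releases the HW entry. */
static void toy_dereg_mr(struct toy_mr *mr)
{
	assert(mr != NULL);
	mr->allocated = false;
}
```

In this model, a LINV posted after a flush always succeeds and returns the MR to
INVALID, which is the recovery path described above for cxgb4.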

Perhaps IB devices don't work this way.

What error did you get when you tried just doing an LINV after a flush?

Steve.
