> On Mar 10, 2016, at 11:10 AM, Steve Wise <swise@xxxxxxxxxxxxxxxxxxxxx> wrote:
>
>>>>>>>>>> Moving the QP into error state right after with rdma_disconnect
>>>>>>>>>> you are not sure that none of the subset of the invalidations
>>>>>>>>>> that _were_ posted completed and you get the corresponding MRs
>>>>>>>>>> in a bogus state...
>>>>>>>>>
>>>>>>>>> Moving the QP to error state and then draining the CQs means
>>>>>>>>> that all LOCAL_INV WRs that managed to get posted will get
>>>>>>>>> completed or flushed. That's already handled today.
>>>>>>>>>
>>>>>>>>> It's the WRs that didn't get posted that I'm worried about
>>>>>>>>> in this patch.
>>>>>>>>>
>>>>>>>>> Are there RDMA consumers in the kernel that use that third
>>>>>>>>> argument to recover when LOCAL_INV WRs cannot be posted?
>>>>>>>>
>>>>>>>> None :)
>>>>>>>>
>>>>>>>>>>> I suppose I could reset these MRs instead (that is,
>>>>>>>>>>> pass them to ib_dereg_mr).
>>>>>>>>>>
>>>>>>>>>> Or, just wait for a completion for those that were posted
>>>>>>>>>> and then all the MRs are in a consistent state.
>>>>>>>>>
>>>>>>>>> When a LOCAL_INV completes with IB_WC_SUCCESS, the associated
>>>>>>>>> MR is in a known state (ie, invalid).
>>>>>>>>>
>>>>>>>>> The WRs that flush mean the associated MRs are not in a known
>>>>>>>>> state. Sometimes the MR state is different than the hardware
>>>>>>>>> state, for example. Trying to do anything with one of these
>>>>>>>>> inconsistent MRs results in IB_WC_BIND_MW_ERR until the thing
>>>>>>>>> is deregistered.
>>>>>>>>
>>>>>>>> Correct.
>>>>>>>>
>>>>>>>
>>>>>>> It is legal to invalidate an MR that is not in the valid state. So you don't
>>>>>>> have to deregister it, you can assume it is valid and post another LINV WR.
>>>>>>
>>>>>> I've tried that. Once the MR is inconsistent, even LOCAL_INV
>>>>>> does not work.
>>>>>>
>>>>>
>>>>> Maybe IB Verbs don't mandate that invalidating an invalid MR must be allowed?
>>>>> (looking at the verbs spec now).
>>>>
>>>
>>> IB Verbs doesn't have specify this requirement. iW verbs does. So transport
>>> independent applications cannot rely on it. So ib_dereg_mr() seems to be the
>>> only thing you can do.
>>>
>>>> If the MR is truly invalid, then there is no issue, and
>>>> the second LOCAL_INV completes successfully.
>>>>
>>>> The problem is after a flushed LOCAL_INV, the MR state
>>>> sometimes does not match the hardware state. The MR is
>>>> neither registered or invalid.
>>>>
>>>
>>> There is a difference, at least with iWARP devices, between the MR state: VALID
>>> vs INVALID, and if the MR is allocated or not.
>>>
>>>> A flushed LOCAL_INV tells you nothing more than that the
>>>> LOCAL_INV didn't complete. The MR state at that point is
>>>> unknown.
>>>>
>>>
>>> With respect to iWARP and cxgb4: when you allocate a fastreg MR, HW has an entry
>>> for that MR and it is marked "allocated". The MR record in HW also has a state:
>>> VALID or INVALID. While the MR is "allocated" you can post WRs to invalidate it
>>> which changes the state to INVALID, or fast-register memory which makes it
>>> VALID. Regardless of what happens on any given QP, the MR remains "allocated"
>>> until you call ib_dereg_mr(). So at least for cxgb4, you could in fact just
>>> post another LINV to get it back to a known state that allows subsequent
>>> fast-reg WRs.
>>>
>>> Perhaps IB devices don't work this way.
>>>
>>> What error did you get when you tried just doing an LINV after a flush?
>>
>> With CX-2 and CX-3, after a flushed LOCAL_INV, trying either
>> a FASTREG or LOCAL_INV on that MR can sometimes complete with
>> IB_WC_MW_BIND_ERR.
>
>
> I wonder if you post a FASREG+LINV+LINV if you'd get the same failure? IE
> invalidate the same rkey twice. Just as an experiment...

Once the MR is in this state, FASTREG does not work either.
All FASTREG and LINV flush with IB_WC_MW_BIND_ERR until
the MR is deregistered.
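
To make the behavior concrete, here is a rough standalone model of
what I'm describing. It is plain userspace C, not kernel code: every
name in it (model_mr, model_post, model_recover, the enums) is made up
for illustration, and it assumes the worst case, where a flushed WR
always leaves the MR unusable (on real CX-2/CX-3 hardware this happens
only sometimes). model_recover() stands in for ib_dereg_mr() followed
by a fresh MR allocation.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model types; none of these exist in the kernel. */
enum mr_state { MR_INVALID, MR_VALID, MR_UNKNOWN };
enum wc_status { WC_SUCCESS, WC_FLUSH_ERR, WC_MW_BIND_ERR };
enum wr_op { OP_FASTREG, OP_LOCAL_INV };

struct model_mr {
	enum mr_state state;
	unsigned int generation;	/* bumped on each dereg + realloc */
};

/*
 * Return the completion status the modeled device reports for one WR.
 * Assumption: a WR flushed because the QP is in error always leaves
 * the MR in an unknown state (worst case).
 */
static enum wc_status model_post(struct model_mr *mr, enum wr_op op,
				 bool qp_in_error)
{
	assert(mr);

	/* Once inconsistent, neither FASTREG nor LOCAL_INV works. */
	if (mr->state == MR_UNKNOWN)
		return WC_MW_BIND_ERR;

	/* A flush says only that the WR did not complete. */
	if (qp_in_error) {
		mr->state = MR_UNKNOWN;
		return WC_FLUSH_ERR;
	}

	mr->state = (op == OP_FASTREG) ? MR_VALID : MR_INVALID;
	return WC_SUCCESS;
}

/*
 * Recovery policy: an MR in an unknown state is deregistered and
 * replaced; a fresh MR starts out invalid and ready for FASTREG.
 */
static void model_recover(struct model_mr *mr)
{
	assert(mr);

	if (mr->state != MR_UNKNOWN)
		return;
	mr->state = MR_INVALID;
	mr->generation++;
}
```

In this model, posting a second LOCAL_INV after a flush never brings
the MR back; only model_recover() does, which matches what I see on
the IB devices above but not what Steve describes for cxgb4.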
--
Chuck Lever