Re: FastLinQ: possible duplicate flush of FastReg and LocalInv

Chuck Lever III <chuck.lever@xxxxxxxxxx> · Thu, 25 Mar 2021 17:26:05 +0000

> On Mar 17, 2021, at 2:39 PM, Tom Talpey <tom@xxxxxxxxxx> wrote:
> 
> On 3/17/2021 11:14 AM, Chuck Lever III wrote:
>>> On Mar 16, 2021, at 3:58 PM, Chuck Lever III <chuck.lever@xxxxxxxxxx> wrote:
>>> 
>>> Hi-
>>> 
>>> I've been trying to track down some crashes when running NFS/RDMA
>>> tests over FastLinQ devices in iWARP mode. To make it stressful,
>>> I've enabled disconnect injection, where rpcrdma injects a
>>> connection disconnect every so often.
>>> 
>>> As part of a disconnect event, the Receive and Send queues are
>>> drained. Sometimes I see a duplicate flush for one or more of the
>>> memory registration ops. This is not a big deal for FastReg
>>> because its completion handler is basically a no-op.
>>> 
>>> But for LocalInv this is a problem. On a flushed completion, the
>>> MR is destroyed. If the completion occurs again, of course, all
>>> kinds of badness happen because we're DMA-unmapping twice,
>>> touching memory that has already been freed, and deleting from a
>>> list_head that is poisonous.
>>> 
>>> The last straw is that wc_localinv_done calls the generic RPC layer
>>> to indicate that an RPC Reply is ready. The duplicate flush
>>> dereferences one or more NULL pointers.
>> So this looked to me like a Queue wrap. After sleeping on it, I
>> decided to try disabling xprtrdma's Send signal batching. Setting
>> ep_send_batch to zero causes every Send WR to be signaled, and
>> that makes the problem go away.
>> This is a little surprising. Every LocalInv chain is signaled. The
>> only possible accounting error might be that ep_send_count does
>> not count FastReg WRs, which are always unsignaled.
> 
> Well, perhaps you're posting several WRs, and the connection is being
> dropped before you post them all. Therefore, you bail out with the
> last one you did post being unsignaled. You had better hope that last
> one is flushed, because if it completed successfully, you may have a
> missing interrupt.
> 
> It's really tricky to get unsignaled right, when errors occur. It
> might still be the provider, but there are possibilities on both
> sides of the API.
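
For readers following the batching discussion quoted above, here is a
minimal userspace sketch of the idea, not the xprtrdma code. The names
demo_batch and demo_should_signal are hypothetical; only the behavior
"a batch of zero signals every Send" is taken from the thread.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the endpoint's batching counters. */
struct demo_batch {
	unsigned int batch;	/* unsignaled Sends allowed between signaled ones */
	unsigned int count;	/* unsignaled Sends still allowed before the next signal */
};

/*
 * Decide whether the next Send WR should request a completion.
 * When the countdown reaches zero the Send is signaled and the
 * countdown is reloaded, so a batch of zero signals every Send.
 */
static bool demo_should_signal(struct demo_batch *b)
{
	if (b->count == 0) {
		b->count = b->batch;
		return true;
	}
	b->count--;
	return false;
}
```

With batch set to 2, the pattern is signaled, unsignaled, unsignaled,
signaled. Note that the sketch counts Sends only, which mirrors the
accounting question raised above about FastReg WRs.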

My current theory is that the only duplicate completions occur when
WRs have been posted after a disconnect. This happens in the window
where the workload is still active and the connection has been lost,
but before the DISCONNECTED CM event.

My expectation was that such a WR would flush through and complete
once. What I'm seeing is that on occasion one or more WRs that
were posted in this window complete twice.

If I add some logic to block posting in that window, the duplicate
completion problem seems to go away. The test runs long enough
without a duplicate completion that I hit other bugs.
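
A minimal userspace sketch of that kind of gate follows. It is an
illustration only: demo_ep, demo_ep_mark_lost, and demo_post are
hypothetical names, and the assumption that the first sign of
connection loss (for example, a flushed completion) can clear the
flag is mine, not something established in this thread.

```c
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical endpoint state; not the xprtrdma structure. */
struct demo_ep {
	atomic_bool posting_allowed;
	unsigned int posted;	/* WRs handed to the simulated provider */
};

static void demo_ep_init(struct demo_ep *ep)
{
	atomic_init(&ep->posting_allowed, true);
	ep->posted = 0;
}

/* Assumed to be called on the first sign that the connection is gone. */
static void demo_ep_mark_lost(struct demo_ep *ep)
{
	atomic_store(&ep->posting_allowed, false);
}

/* Stand-in for the post path: refuse new WRs once posting is blocked. */
static int demo_post(struct demo_ep *ep)
{
	if (!atomic_load(&ep->posting_allowed))
		return -ENOTCONN;
	ep->posted++;
	return 0;
}
```

A flag check like this narrows the window but does not close it: a
post that has already passed the check can still race with the
connection loss.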

I never see duplicate Receive or Send completions.

When a duplicate completion occurs with LocalInv, I typically see
duplicate completions for all WRs on the same chained post. That
might be the case for FastReg also (I haven't looked closely), but
the Send WR these are chained to never sees a duplicate completion
(it could be that my duplicate checking logic for Sends doesn't work).
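
For illustration, duplicate checking of this kind can be as simple as
a per-WR completion counter. This is a hedged sketch, not the logic
actually used in these tests; demo_wr and demo_note_completion are
hypothetical names.

```c
#include <stdbool.h>

enum demo_wr_kind { DEMO_FASTREG, DEMO_LOCALINV, DEMO_SEND };

/* Hypothetical per-WR tracking record. */
struct demo_wr {
	enum demo_wr_kind kind;
	unsigned int completions;	/* completions seen for this WR so far */
};

/* Record one completion; return true if it is a duplicate. */
static bool demo_note_completion(struct demo_wr *wr)
{
	wr->completions++;
	return wr->completions > 1;
}
```

A counter like this is only trustworthy if the record outlives the
first completion and is not reset when the object is reused for a
later post.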

This is with a QLogic Corp. FastLinQ QL41212HLCU 25GbE Adapter and
Storm FW 8.42.2.0, Management FW 8.30.18.0 [MBI 8.30.29].

--
Chuck Lever