Re: [PATCH v3] SUNRPC: Fix a race in the receive code path

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



> On Dec 14, 2017, at 3:37 PM, Trond Myklebust <trondmy@xxxxxxxxxxxxxxx> wrote:
> 
> On Thu, 2017-12-14 at 14:22 -0500, Chuck Lever wrote:
>>> On Dec 14, 2017, at 2:03 PM, Trond Myklebust <trondmy@primarydata.c
>>> om> wrote:
>>> 
>>> Does the RDMA code update the connect cookie when the connection
>>> breaks? It looks to me as if it only does that when the connection
>>> is
>>> re-established. We really want both.
>>> 
>>>> I imagine this issue could also impact write buffer exhaustion
>>>> on TCP.
>>> 
>>> See net/sunrpc/xprtsock.c:xs_tcp_state_change()
>> 
>> xprtrdma manipulates the connect_cookie in its connect worker,
>> see rpcrdma_connect_worker. This was added by:
>> 
>> commit 575448bd36208f99fe0dd554a43518d798966740
>> Author:     Tom Talpey <talpey@xxxxxxxxxx>
>> AuthorDate: Thu Oct 9 15:00:40 2008 -0400
>> Commit:     Trond Myklebust <Trond.Myklebust@xxxxxxxxxx>
>> CommitDate: Fri Oct 10 15:10:36 2008 -0400
>> 
>>    RPC/RDMA: suppress retransmit on RPC/RDMA clients.
>> 
>> Would it be more correct to bump the cookie in rpcrdma_conn_upcall,
>> which is the equivalent to xs_tcp_state_change? (if so, why, so
>> I can compose a reasonable patch description)
>> 
>> It could be bumped in the RDMA_CM_EVENT_ESTABLISHED and the
>> RDMA_CM_EVENT_DISCONNECTED cases, for example. I'm not sure
>> RDMA provides a distinction between "server disconnected"
>> and "client disconnected" although that probably does not
>> matter for this purpose.
>> 
>> But, why would the additional cookie update help? The transport
>> is not disconnecting before the deadlock.
>> 
> 
> The connection cookie's purpose is twofold:
> 
> 1) It tracks whether or not a request has been transmitted on the
> current connection or not.

That's broken by setting the cookie unconditionally outside
the transport_lock, isn't it?


> 2) It ensures that when several requests with the same connection
> cookie all call xprt_conditional_disconnect(), then that results in a
> single disconnection event. To do so, it assumes that xprt_autoclose()
> will change the cookie if the disconnection attempt is successful.
> 
> In TCP we do so in the xs_tcp_state_change(). If the RDMA transport can
> guarantee that the call to xprt->ops->close(xprt) is always successful,
> then you could do so there.

I don't mind moving the cookie bump to rpcrdma_conn_upcall,
but I'm not sure I understand the locking requirements.

Currently, xprt_transmit sets the connect_cookie while holding
the transport_lock.

xprt_conditional_disconnect compares the cookie while holding
the transport_lock.

For TCP, the transport_lock is held when bumping the cookie
in the ESTABLISHED case, but _not_ in the two CLOSE cases?

xprtrdma holds the transport_lock when bumping the cookie,
which it does in its connect worker. It has to hold the lock
because it skips the value 0. xprtrdma needs to guarantee
that an RPC is never transmitted on the same connection
twice (and maybe it could use rq_connect_cookie instead of
its own cookie).

xprt_reserve_init is holding the reserve_lock but not the
transport_lock when it grabs the cookie. Maybe it should
not be initializing the rqst's cookie there?

Seems to me that xprt_transmit needs to update the rqst's
cookie while holding the transport_lock, especially if
xprtrdma needs to skip a cookie value? I'm sure I'm missing
something.


--
Chuck Lever



--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html




[Index of Archives]     [Linux Filesystem Development]     [Linux USB Development]     [Linux Media Development]     [Video for Linux]     [Linux NILFS]     [Linux Audio Users]     [Yosemite Info]     [Linux SCSI]

  Powered by Linux