A seemingly innocent optimization to xs_bind() broke TCP port reuse: a
non-reserved ephemeral source port is no longer saved in
struct sock_xprt (srcport). For a non-reserved port, allocation happens
as part of kernel_connect() inside xs_tcp_finish_connecting().
kernel_connect() returns -EINPROGRESS, and the code skips stashing
srcport in sock_xprt for reconnects. This affects the server's DRC
during a network partition, where the client's RPC recovery would
reconnect from a different port.

Reported-by: Alexey Kuznetsov <alexeyk@xxxxxxxxxx>
Reviewed-by: Jacob Strauss <jsstraus@xxxxxxxxxx>
Reviewed-by: Alakesh Haloi <alakeshh@xxxxxxxxxx>
Signed-off-by: Vallish Vaidyeshwara <vallish@xxxxxxxxxx>
Fixes: 0f7a622c ("rpc: xs_bind - do not bind when requesting a random ephemeral port")
---
 net/sunrpc/xprtsock.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
index c8902f1..5bf75b3 100644
--- a/net/sunrpc/xprtsock.c
+++ b/net/sunrpc/xprtsock.c
@@ -2393,9 +2393,11 @@ static int xs_tcp_finish_connecting(struct rpc_xprt *xprt, struct socket *sock)
 	ret = kernel_connect(sock, xs_addr(xprt), xprt->addrlen, O_NONBLOCK);
 	switch (ret) {
 	case 0:
-		xs_set_srcport(transport, sock);
 		/* fall through */
 	case -EINPROGRESS:
+		/* Allocated port saved for reconnect */
+		xs_set_srcport(transport, sock);
+
 		/* SYN_SENT! */
 		if (xprt->reestablish_timeout < XS_TCP_INIT_REEST_TO)
 			xprt->reestablish_timeout = XS_TCP_INIT_REEST_TO;
-- 
2.7.3.AMZN
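
For illustration only, here is a minimal userspace sketch in C of the
behaviour the fix relies on. It is not kernel code and not part of the
patch. It assumes Linux sockets, and the helper names local_port(),
start_connect() and make_loopback_listener() are hypothetical. The point
it tries to show: when connect() is called on a non-blocking TCP socket
that was never bound, the ephemeral source port has already been assigned
by the time connect() returns, whether the result is 0 or EINPROGRESS,
so the port can be recorded in both cases.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: local port of fd in host byte order, 0 on error. */
static unsigned short local_port(int fd)
{
	struct sockaddr_in sa;
	socklen_t len = sizeof(sa);

	if (getsockname(fd, (struct sockaddr *)&sa, &len) < 0)
		return 0;
	return ntohs(sa.sin_port);
}

/*
 * Hypothetical helper: start a non-blocking connect without a prior
 * bind() and record the source port the kernel picked. Returns the
 * socket fd, or -1 on failure.
 */
static int start_connect(const struct sockaddr_in *dst,
			 unsigned short *srcport)
{
	int fd, flags, ret;

	assert(srcport != NULL);

	fd = socket(AF_INET, SOCK_STREAM, 0);
	if (fd < 0)
		return -1;

	flags = fcntl(fd, F_GETFL, 0);
	if (flags < 0 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0) {
		close(fd);
		return -1;
	}

	ret = connect(fd, (const struct sockaddr *)dst, sizeof(*dst));
	if (ret == 0 || errno == EINPROGRESS) {
		/* Port is allocated in both cases, so save it in both. */
		*srcport = local_port(fd);
		return fd;
	}

	close(fd);
	return -1;
}

/*
 * Hypothetical demo helper: listen on 127.0.0.1 with a kernel-chosen
 * port and store the bound address in *addr.
 */
static int make_loopback_listener(struct sockaddr_in *addr)
{
	socklen_t len = sizeof(*addr);
	int fd = socket(AF_INET, SOCK_STREAM, 0);

	if (fd < 0)
		return -1;

	memset(addr, 0, sizeof(*addr));
	addr->sin_family = AF_INET;
	addr->sin_addr.s_addr = htonl(INADDR_LOOPBACK);
	addr->sin_port = 0;

	if (bind(fd, (struct sockaddr *)addr, sizeof(*addr)) < 0 ||
	    listen(fd, 1) < 0 ||
	    getsockname(fd, (struct sockaddr *)addr, &len) < 0) {
		close(fd);
		return -1;
	}
	return fd;
}
```

A caller would obtain a destination address (for example from
make_loopback_listener()), pass it to start_connect(), and keep the
returned port for a later reconnect. This mirrors, in userspace terms,
calling xs_set_srcport() in the -EINPROGRESS case as well as the 0 case.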