On Fri, 2020-04-17 at 14:08 -0400, Olga Kornievskaia wrote: > On Fri, Apr 17, 2020 at 12:53 PM Trond Myklebust > <trondmy@xxxxxxxxxxxxxxx> wrote: > > On Fri, 2020-04-17 at 12:46 -0400, Olga Kornievskaia wrote: > > > On Fri, Apr 17, 2020 at 12:20 PM Trond Myklebust > > > <trondmy@xxxxxxxxxxxxxxx> wrote: > > > > On Fri, 2020-04-17 at 11:43 -0400, Olga Kornievskaia wrote: > > > > > Hi Trond, > > > > > > > > > > On Fri, Apr 17, 2020 at 11:31 AM Trond Myklebust > > > > > <trondmy@xxxxxxxxxxxxxxx> wrote: > > > > > > Hi Olga, > > > > > > > > > > > > On Fri, 2020-04-17 at 11:15 -0400, Olga Kornievskaia wrote: > > > > > > > When nconnect is used, SEQUENCE operation currently isn't > > > > > > > bound > > > > > > > to > > > > > > > a particular transport. The problem is created on an idle > > > > > > > mount, > > > > > > > where SEQUENCE is the only operation being sent and > > > > > > > opened > > > > > > > TPC > > > > > > > connections are slowly being close from the lack of use. > > > > > > > If > > > > > > > SEQUENCE > > > > > > > is not assigned to the main connection, the main > > > > > > > connection > > > > > > > can > > > > > > > be closed and with that so is the back channel bound to > > > > > > > that > > > > > > > connection. > > > > > > > > > > > > > > Since the only way client handles callback_path down is > > > > > > > by > > > > > > > sending > > > > > > > BIND_CONN_TO_SESSION requesting to bind both backchannel > > > > > > > and > > > > > > > fore > > > > > > > channel on the connection that was left going, but that > > > > > > > connection > > > > > > > was already bound to only forechannel. According to the > > > > > > > spec, > > > > > > > it's > > > > > > > not allowed to change channel binding after they are > > > > > > > done. > > > > > > > > > > > > > > The fix is to make sure that a lone SEQUENCE always goes > > > > > > > on > > > > > > > the > > > > > > > main connection, keeping backchannel alive. > > > > > > > > > > > > > > Fixes: 5a0c257f8 ("NFS: send state management on a single > > > > > > > connection") > > > > > > > Signed-off-by: Olga Kornievskaia <kolga@xxxxxxxxxx> > > > > > > > --- > > > > > > > fs/nfs/nfs4proc.c | 2 +- > > > > > > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > > > > > > > > > > > diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c > > > > > > > index 99e9f2e..461f85d 100644 > > > > > > > --- a/fs/nfs/nfs4proc.c > > > > > > > +++ b/fs/nfs/nfs4proc.c > > > > > > > @@ -8857,7 +8857,7 @@ static struct rpc_task > > > > > > > *_nfs41_proc_sequence(struct nfs_client *clp, > > > > > > > .rpc_client = clp->cl_rpcclient, > > > > > > > .rpc_message = &msg, > > > > > > > .callback_ops = &nfs41_sequence_ops, > > > > > > > - .flags = RPC_TASK_ASYNC | RPC_TASK_TIMEOUT, > > > > > > > + .flags = RPC_TASK_ASYNC | RPC_TASK_TIMEOUT > > > > > > > | > > > > > > > RPC_TASK_NO_ROUND_ROBIN, > > > > > > > }; > > > > > > > struct rpc_task *ret; > > > > > > > > > > > > > > > > > > > This works only in the case where the client is only > > > > > > sending > > > > > > SEQUENCE > > > > > > instructions. There are other cases where it could be > > > > > > sending > > > > > > out > > > > > > other > > > > > > operations that also renew the lease, but is doing it very > > > > > > infrequently. Won't that also run into the same problem? > > > > > > > > > > Hm... I see so main channel can still go idle and close, when > > > > > infrequent operations are happening on the other connections > > > > > before > > > > > round-robin-ing to the main connection.... > > > > > > > > > > > Is the fundamental problem here that we're not handling the > > > > > > SEQ4_STATUS_CB_PATH_DOWN / SEQ4_STATUS_CB_PATH_DOWN_SESSION > > > > > > flags > > > > > > correctly or is there something else going on? > > > > > > > > > > Yes the client doesn't recover properly. But the fix wasn't > > > > > trivial > > > > > to > > > > > me (so I thought my patch was enough but I see it's not). Say > > > > > client > > > > > shuts down the main connection because it was idle. Now > > > > > whatever > > > > > operations goes on a different connection is going to get > > > > > callback > > > > > down. The only way the client can create a new backchannel > > > > > (according > > > > > to the spec) is if it creates a brand new connection and > > > > > sends > > > > > BIND_CONN_TO_SESSION there (all existing connections are > > > > > already > > > > > bound > > > > > to fore channel and according to the spec you can't modify > > > > > the > > > > > existing binding). But then we'd need to make sure that it's > > > > > the > > > > > first > > > > > one in the list of connections we iterate thru (as i think > > > > > 1st > > > > > marks > > > > > the main connection?) as the other operations that supposed > > > > > to > > > > > only > > > > > go > > > > > on main connection need to know which connection to pick. > > > > > > > > > > The reason it's not seen against linux is because it doesn't > > > > > follow > > > > > the spec is doesn't reject attempts to bind a backchannel to > > > > > an > > > > > already existing connection that was only bound for fore > > > > > channel. > > > > > > > > > > > > > Oh, I see. So the server is replying NFS4ERR_INVAL in order to > > > > let > > > > the > > > > client know that it is trying to change the channel bindings > > > > for > > > > that > > > > connection. > > > > > > Well server isn't failing because client is asking for > > > FORE_OR_BOTH > > > and it's a choice so server is returning FORE. I'm not sure we > > > can > > > ask > > > the server to fail the request with ERR_INVAL.... (rather I can > > > ask > > > but ) rather can we expect the server to do that always? > > > > In RFC5661, Section 18.34.3 I found the following normative text: > > > > Invoking BIND_CONN_TO_SESSION on a connection already associated > > with > > the specified session has no effect, and the server MUST respond > > with > > NFS4_OK, unless the client is demanding changes to the set of > > channels the connection is associated with. If so, the server > > MUST > > return NFS4ERR_INVAL. > > > > > > IOW: it sounds like your server isn't following the spec either. > > Thank you thank you! I missed that. Ok so we'll assume it does and I > will try to come up with what you are suggesting... > > Do we need to do reorder the connection list? After we reset the > connection, it will be on some random connection on the list of > connections. But for the NO_ROBIN connections we pick the first > connection in the list (which will not be the backchannel > connection). > Does it matter? That's why I suggested the rpc_task_close_connection() wrapper below. It should ensure that we close the same connection that got the NFS4ERR_INVAL, and then when we restart the RPC call (using rpc_restart_call_prepare()) it will just automatically create a new connection on that same struct xprt... > > > > > Hmm... Is there any reason why we can't just add a handler to > > > > nfs4_bind_one_conn_to_session_done() that intercepts > > > > NFS4ERR_INVAL, > > > > and > > > > disconnects the xprt before retrying? > > > > We should probably add a wrapper to xprt_force_disconnect() in > > > > include/linux/sunrpc/clnt.h. Something like the following? > > > > > > > > > > > > static inline void rpc_task_close_connection(struct rpc_task > > > > *task) > > > > { > > > > if (task->tk_xprt) > > > > xprt_force_disconnect(task->tk_xprt); > > > > } > > > > > > -- Trond Myklebust Linux NFS client maintainer, Hammerspace trond.myklebust@xxxxxxxxxxxxxxx