On Tue, 2021-06-08 at 18:21 -0400, Olga Kornievskaia wrote:
> On Tue, Jun 8, 2021 at 5:41 PM Trond Myklebust
> <trondmy@xxxxxxxxxxxxxxx> wrote:
> >
> > On Tue, 2021-06-08 at 17:30 -0400, Olga Kornievskaia wrote:
> > > On Tue, Jun 8, 2021 at 5:26 PM Trond Myklebust
> > > <trondmy@xxxxxxxxxxxxxxx> wrote:
> > > >
> > > > On Tue, 2021-06-08 at 21:19 +0000, Trond Myklebust wrote:
> > > > > On Tue, 2021-06-08 at 21:06 +0000, Chuck Lever III wrote:
> > > > > >
> > > > > >
> > > > > > > On Jun 8, 2021, at 4:56 PM, Olga Kornievskaia
> > > > > > > <olga.kornievskaia@xxxxxxxxx> wrote:
> > > > > > >
> > > > > > > On Tue, Jun 8, 2021 at 4:41 PM Chuck Lever III
> > > > > > > <chuck.lever@xxxxxxxxxx> wrote:
> > > > > > > >
> > > > > > > >
> > > > > > > > > On Jun 8, 2021, at 2:45 PM, Olga Kornievskaia
> > > > > > > > > <olga.kornievskaia@xxxxxxxxx> wrote:
> > > > > > > > >
> > > > > > > > > From: Olga Kornievskaia <kolga@xxxxxxxxxx>
> > > > > > > > >
> > > > > > > > > After trunking is discovered in nfs4_discover_server_trunking(),
> > > > > > > > > add the transport to the old client structure before destroying
> > > > > > > > > the new client structure (along with its transport).
> > > > > > > > >
> > > > > > > > > Signed-off-by: Olga Kornievskaia <kolga@xxxxxxxxxx>
> > > > > > > > > ---
> > > > > > > > >  fs/nfs/nfs4client.c | 40 ++++++++++++++++++++++++++++++++++++++++
> > > > > > > > >  1 file changed, 40 insertions(+)
> > > > > > > > >
> > > > > > > > > diff --git a/fs/nfs/nfs4client.c b/fs/nfs/nfs4client.c
> > > > > > > > > index 42719384e25f..984c851844d8 100644
> > > > > > > > > --- a/fs/nfs/nfs4client.c
> > > > > > > > > +++ b/fs/nfs/nfs4client.c
> > > > > > > > > @@ -361,6 +361,44 @@ static int nfs4_init_client_minor_version(struct nfs_client *clp)
> > > > > > > > >         return nfs4_init_callback(clp);
> > > > > > > > >  }
> > > > > > > > >
> > > > > > > > > +static void nfs4_add_trunk(struct nfs_client *clp, struct nfs_client *old)
> > > > > > > > > +{
> > > > > > > > > +       struct sockaddr_storage clp_addr, old_addr;
> > > > > > > > > +       struct sockaddr *clp_sap = (struct sockaddr *)&clp_addr;
> > > > > > > > > +       struct sockaddr *old_sap = (struct sockaddr *)&old_addr;
> > > > > > > > > +       size_t clp_salen, old_salen;
> > > > > > > > > +       struct xprt_create xprt_args = {
> > > > > > > > > +               .ident = old->cl_proto,
> > > > > > > > > +               .net = old->cl_net,
> > > > > > > > > +               .servername = old->cl_hostname,
> > > > > > > > > +       };
> > > > > > > > > +       struct nfs4_add_xprt_data xprtdata = {
> > > > > > > > > +               .clp = old,
> > > > > > > > > +       };
> > > > > > > > > +       struct rpc_add_xprt_test rpcdata = {
> > > > > > > > > +               .add_xprt_test = old->cl_mvops->session_trunk,
> > > > > > > > > +               .data = &xprtdata,
> > > > > > > > > +       };
> > > > > > > > > +
> > > > > > > > > +       if (clp->cl_proto != old->cl_proto)
> > > > > > > > > +               return;
> > > > > > > > > +       clp_salen = rpc_peeraddr(clp->cl_rpcclient, clp_sap, sizeof(clp_addr));
> > > > > > > > > +       old_salen = rpc_peeraddr(old->cl_rpcclient, old_sap, sizeof(old_addr));
> > > > > > > > > +
> > > > > > > > > +       if (clp_addr.ss_family != old_addr.ss_family)
> > > > > > > > > +               return;
> > > > > > > > > +
> > > > > > > > > +       xprt_args.dstaddr = clp_sap;
> > > > > > > > > +       xprt_args.addrlen = clp_salen;
> > > > > > > > > +
> > > > > > > > > +       xprtdata.cred = nfs4_get_clid_cred(old);
> > > > > > > > > +       rpc_clnt_add_xprt(old->cl_rpcclient, &xprt_args,
> > > > > > > > > +                         rpc_clnt_setup_test_and_add_xprt, &rpcdata);
> > > > > > > >
> > > > > > > > Is there an upper bound on the number of transports that
> > > > > > > > are added to the NFS client's switch?
> > > > > > >
> > > > > > > I don't believe any limits exist right now. Why should there be a
> > > > > > > limit? Are you saying that the client should limit trunking? While
> > > > > > > this is not what's happening here, but say FS_LOCATION returned 100
> > > > > > > ips for the server. Are you saying the client should be limiting
> > > > > > > how many trunkable connections it would be establishing and picking
> > > > > > > just a few addresses to try? What's happening with this patch is
> > > > > > > that say there are 100 mounts to 100 ips (each representing the
> > > > > > > same server or trunkable server(s)), without this patch a single
> > > > > > > connection is kept, with this patch we'll have 100 connections.
> > > > > >
> > > > > > The patch description needs to make this behavior more clear. It
> > > > > > needs to explain "why" -- the body of the patch already explains
> > > > > > "what". Can you include your last sentence in the description as
> > > > > > a use case?
> > > > > >
> > > > > > As for the behavior... Seems to me that there are going to be only
> > > > > > infinitesimal gains after the first few connections, and after
> > > > > > that, it's going to be a lot for both sides to manage for no real
> > > > > > gain. I think you do want to cap the total number of connections
> > > > > > at a reasonable number, even in the FS_LOCATIONS case.
> > > > > >
> > > > >
> > > > > I'd tend to agree. If you want to scale I/O then pNFS is the way to
> > > > > go, not vast numbers of connections to a single server.
> > > > >
> > > > BTW: AFAICS this patch will end up adding another connection every
> > > > time we mount a new filesystem, whether or not a connection already
> > > > exists to that IP address. That's unacceptable.
> > >
> > > Not in my testing. Mounts to the same server (same IP) different
> > > export volumes lead to use of a single connection.
> > >
> >
> > Ah... Never mind. The comparison is made by
> > rpc_clnt_setup_test_and_add_xprt(), and the address is discarded if it
> > is already present.
> >
> > However why are you running trunking detection a second time on the
> > address you've just run trunking detection on and have decided to add?
>
> Because that's where we determined that these are different clients
> and are trunkable? But I guess also for the ease of re-using existing
> code. There is no hook to create an xprt and add it to an existing RPC
> client. One can be created.

Why not rpc_clnt_add_xprt()? That's what pNFS uses for this case.

-- 
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@xxxxxxxxxxxxxxx
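
The alternative Trond points at is the pattern the pNFS data-server
connect path uses: hand the new transport to rpc_clnt_add_xprt() with
rpc_clnt_test_and_add_xprt() as the setup callback, which only rejects
a duplicate address, instead of a callback that re-runs the
session-trunking probe. A rough, untested sketch of that approach for
the trunking case follows; the helper name is hypothetical and the
address-family check from the posted patch is omitted for brevity.

static void nfs4_add_trunk_noprobe(struct nfs_client *clp, struct nfs_client *old)
{
        struct sockaddr_storage clp_addr;
        struct sockaddr *clp_sap = (struct sockaddr *)&clp_addr;
        size_t clp_salen;
        struct xprt_create xprt_args = {
                .ident = old->cl_proto,
                .net = old->cl_net,
                .servername = old->cl_hostname,
        };

        if (clp->cl_proto != old->cl_proto)
                return;

        /* Reuse the address that trunking detection already verified. */
        clp_salen = rpc_peeraddr(clp->cl_rpcclient, clp_sap, sizeof(clp_addr));
        xprt_args.dstaddr = clp_sap;
        xprt_args.addrlen = clp_salen;

        /*
         * rpc_clnt_test_and_add_xprt() only drops the transport if its
         * address is already present in the xprt switch; no second
         * trunking (EXCHANGE_ID) test is issued over the new connection.
         */
        rpc_clnt_add_xprt(old->cl_rpcclient, &xprt_args,
                          rpc_clnt_test_and_add_xprt, NULL);
}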