> On May 27, 2024, at 1:15 AM, gaurav gangalwar <gaurav.gangalwar@xxxxxxxxx> wrote:
>
> Hi,
> Facing one more issue while using referrals with RDMA.
> If RDMA is enabled and supported on both client and server and we
> mount the parent with TCP, then the referral/submount mount will be done
> over RDMA instead of TCP, since for a referral/submount mount the client
> tries RDMA first and then TCP only if the RDMA connection fails.
>
> As we can see here, parent /home on .160 is mounted with tcp, and t1 is
> a referral mount mounted with rdma
>
>> /root/tcp-mnt1 from 10.53.87.160:/home
>> Flags: rw,relatime,vers=4.0,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=10.53.87.158,local_lock=none,addr=10.53.87.160
>>
>> /root/tcp-mnt1/t1 from 10.53.87.157:/:home/t1
>> Flags: rw,relatime,vers=4.0,rsize=1048576,wsize=1048576,namlen=255,hard,proto=rdma,port=20049,timeo=600,retrans=2,sec=sys,clientaddr=10.53.87.158,local_lock=none,addr=10.53.87.157
>
>
> Code which tries RDMA first, can we get transport type from parent and
> use the same?

You could. I was going for what Solaris does -- it tries to mount
a referral with RDMA first.

So I can understand your usage scenario, why do you not want to
use RDMA for either the parent mount or the submount?

>> #if IS_ENABLED(CONFIG_SUNRPC_XPRT_RDMA)
>>         rpc_set_port(&ctx->nfs_server.address, NFS_RDMA_PORT);
>>         error = nfs4_set_client(server,
>>                                 ctx->nfs_server.hostname,
>>                                 &ctx->nfs_server._address,
>>                                 ctx->nfs_server.addrlen,
>>                                 parent_client->cl_ipaddr,
>>                                 XPRT_TRANSPORT_RDMA,
>>                                 parent_server->client->cl_timeout,
>>                                 parent_client->cl_mvops->minor_version,
>>                                 parent_client->cl_nconnect,
>>                                 parent_client->cl_max_connect,
>>                                 parent_client->cl_net,
>>                                 &parent_client->cl_xprtsec);
>>         if (!error)
>>                 goto init_server;
>> #endif /* IS_ENABLED(CONFIG_SUNRPC_XPRT_RDMA) */
>>
>
> Regards,
> Gaurav Gangalwar
>
> On Tue, Feb 27, 2024 at 6:22 PM gaurav gangalwar
> <gaurav.gangalwar@xxxxxxxxx> wrote:
>>
>> One more issue with referral code is there is no retry on connection failure
>>
>>> Feb 26 01:49:32 rbt-el7-1 kernel: nfs_create_rpc_client: cannot create RPC client. Error = -111
>>> Feb 26 01:49:32 rbt-el7-1 kernel: NFS4: Couldn't follow remote path
>>> Feb 26 01:49:32 rbt-el7-1 kernel: <-- nfs4_get_referral_tree() = -111 [error]
>>
>>
>> I was expecting retries from the client if submount fails if it's a hard mount on parent, but it fails submount.
>> I can understand we will be stuck in a loop if fs info is not valid, then connection will always fail.
>>
>> Regards,
>> Gaurav Gangalwar
>>
>> On Mon, Feb 12, 2024 at 7:23 PM Chuck Lever III <chuck.lever@xxxxxxxxxx> wrote:
>>>
>>>
>>>
>>>> On Feb 12, 2024, at 12:51 AM, gaurav gangalwar <gaurav.gangalwar@xxxxxxxxx> wrote:
>>>>
>>>> I think I was using an older kernel version on a client which doesn't have your fix.
>>>> I tried with the newer version v5.10, it worked fine.
>>>>
>>>> The only issue I see is we are not inheriting port from the parent in nfs4_create_referral_server, so even if I use port=20047 in mount it will try referral submount on 20049 only.
>>>>
>>>> rpc_set_port(data->addr, NFS_RDMA_PORT);
>>>>
>>>> We could inherit this also from parent?
>>>
>>> The client is supposed to use the port number information contained
>>> in the referral. There's nothing that mandates that the two servers
>>> will use the same alternate port.
>>>
>>> Using a constant here is probably wrong for both the TCP and RDMA
>>> cases, though.
>>>
>>>
>>> --
>>> Chuck Lever
>>>
>>>
>

--
Chuck Lever
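
For illustration only, the two ideas raised in this thread (follow the parent's transport instead of always trying RDMA first, and prefer a port carried by the referral over a constant) could be sketched as below. This is a standalone userspace sketch, not kernel code: `xprt_kind`, `connect_attempt`, `pick_port`, and `plan_referral_attempts` are hypothetical names, and treating a referral port of 0 as "no port given" is an assumption made here. The only values taken from the thread and common convention are the default ports 2049 (TCP) and 20049 (RDMA).

```c
#include <stdbool.h>
#include <stdint.h>

/* All names below are hypothetical stand-ins, not kernel types. */
enum xprt_kind { XPRT_KIND_TCP, XPRT_KIND_RDMA };

#define DEFAULT_TCP_PORT   2049u   /* conventional NFS port */
#define DEFAULT_RDMA_PORT  20049u  /* conventional NFS/RDMA port */

struct connect_attempt {
    enum xprt_kind xprt;
    uint16_t port;
};

/* Use the port from the referral when it carries one (non-zero is an
 * assumption of this sketch); otherwise use the transport's default. */
static uint16_t pick_port(uint16_t referral_port, enum xprt_kind xprt)
{
    if (referral_port != 0)
        return referral_port;
    return xprt == XPRT_KIND_RDMA ? DEFAULT_RDMA_PORT : DEFAULT_TCP_PORT;
}

/* Fill out[] with the ordered connection attempts for a referral and
 * return how many there are (1 or 2). RDMA is attempted only when the
 * parent mount already uses RDMA; TCP is always the final attempt. */
static int plan_referral_attempts(enum xprt_kind parent_xprt,
                                  bool rdma_available,
                                  uint16_t referral_port,
                                  struct connect_attempt out[2])
{
    int n = 0;

    if (parent_xprt == XPRT_KIND_RDMA && rdma_available) {
        out[n].xprt = XPRT_KIND_RDMA;
        out[n].port = pick_port(referral_port, XPRT_KIND_RDMA);
        n++;
    }
    out[n].xprt = XPRT_KIND_TCP;
    out[n].port = pick_port(referral_port, XPRT_KIND_TCP);
    n++;
    return n;
}
```

Under this policy a parent mounted with proto=tcp yields a single TCP attempt, so a submount like t1 above would stay on TCP even when RDMA is available on both ends. Whether the same referral port should be reused for the TCP fallback is left open here.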