> > On Mar 23, 2021, at 1:46 AM, Nagendra Tomar <Nagendra.Tomar@xxxxxxxxxxxxx> wrote:
> >
> > From: Nagendra S Tomar <natomar@xxxxxxxxxxxxx>
> >
> > If a clustered NFS server is behind an L4 loadbalancer the default
> > nconnect roundrobin policy may cause RPC requests to a file to be
> > sent to different cluster nodes. This is because the source port
> > would be different for all the nconnect connections.
> > While this should functionally work (since the cluster will usually
> > have a consistent view irrespective of which node is serving the
> > request), it may not be desirable from performance pov. As an
> > example we have an NFSv3 frontend to our Object store, where every
> > NFSv3 file is an object. Now if writes to the same file are sent
> > roundrobin to different cluster nodes, the writes become very
> > inefficient due to the consistency requirement for object update
> > being done from different nodes.
> > Similarly each node may maintain some kind of cache to serve the file
> > data/metadata requests faster and even in that case it helps to have
> > a xprt affinity for a file/dir.
> > In general we have seen such scheme to scale very well.
> >
> > This patch introduces a new rpc_xprt_iter_ops for using an additional
> > u32 (filehandle hash) to affine RPCs to the same file to one xprt.
> > It adds a new mount option "ncpolicy=roundrobin|hash" which can be
> > used to select the nconnect multipath policy for a given mount and
> > pass the selected policy to the RPC client.
>
> This sets off my "not another administrative knob that has
> to be tested and maintained, and can be abused" allergy.
>
> Also, my "because connections are shared by mounts of the same
> server, all those mounts will all adopt this behavior" rhinitis.

Yes, it's fair to call this out, but ncpolicy behaves like the nconnect
parameter in this regard.

> And my "why add a new feature to a legacy NFS version" hives.
>
> I agree that your scenario can and should be addressed somehow.
> I'd really rather see this done with pNFS.
>
> Since you are proposing patches against the upstream NFS client,
> I presume all your clients /can/ support NFSv4.1+. It's the NFS
> servers that are stuck on NFSv3, correct?

Yes.

>
> The flexfiles layout can handle an NFSv4.1 client and NFSv3 data
> servers. In fact it was designed for exactly this kind of mix of
> NFS versions.
>
> No client code change will be necessary -- there are a lot more
> clients than servers. The MDS can be made to work smartly in
> concert with the load balancer, over time; or it can adopt other
> clever strategies.
>
> IMHO pNFS is the better long-term strategy here.

The fundamental difference here is that the clustered NFSv3 server is
available over a single virtual IP. So IIUC, even if we were to use
NFSv4.1 with the flexfiles layout, all the MDS can hand over to the
client is that single (load-balanced) virtual IP, and when the clients
then connect to the NFSv3 DS we still have the same issue.
Am I understanding you right?

Can you please elaborate on what you mean by "MDS can be made to work
smartly in concert with the load balancer"?

>
> > It adds a new rpc_procinfo member p_fhhash, which can be supplied
> > by the specific RPC programs to return a u32 hash of the file/dir the
> > RPC is targeting, and lastly it provides p_fhhash implementation
> > for various NFS v3/v4/v41/v42 RPCs to generate the hash correctly.
> >
> > Thoughts?
> >
> > Thanks,
> > Tomar
> >
> > Nagendra S Tomar (5):
> >   SUNRPC: Add a new multipath xprt policy for xprt selection based
> >     on target filehandle hash
> >   SUNRPC/NFSv3/NFSv4: Introduce "enum ncpolicy" to represent the nconnect
> >     policy and pass it down from mount option to rpc layer
> >   SUNRPC/NFSv4: Rename RPC_TASK_NO_ROUND_ROBIN -> RPC_TASK_USE_MAIN_XPRT
> >   NFSv3: Add hash computation methods for NFSv3 RPCs
> >   NFSv4: Add hash computation methods for NFSv4/NFSv42 RPCs
> >
> >  fs/nfs/client.c                      |   3 +
> >  fs/nfs/fs_context.c                  |  26 ++
> >  fs/nfs/internal.h                    |   2 +
> >  fs/nfs/nfs3client.c                  |   4 +-
> >  fs/nfs/nfs3xdr.c                     | 154 +++++++++++
> >  fs/nfs/nfs42xdr.c                    | 112 ++++++++
> >  fs/nfs/nfs4client.c                  |  14 +-
> >  fs/nfs/nfs4proc.c                    |  18 +-
> >  fs/nfs/nfs4xdr.c                     | 516 ++++++++++++++++++++++++++++++-----
> >  fs/nfs/super.c                       |   7 +-
> >  include/linux/nfs_fs_sb.h            |   1 +
> >  include/linux/sunrpc/clnt.h          |  15 +
> >  include/linux/sunrpc/sched.h         |   2 +-
> >  include/linux/sunrpc/xprtmultipath.h |   9 +-
> >  include/trace/events/sunrpc.h        |   4 +-
> >  net/sunrpc/clnt.c                    |  38 ++-
> >  net/sunrpc/xprtmultipath.c           |  91 +++++-
> >  17 files changed, 913 insertions(+), 103 deletions(-)
>
> --
> Chuck Lever
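
For anyone following the thread, the difference between the two ncpolicy
values described in the quoted cover letter can be sketched roughly as
below. This is an illustrative userspace sketch only, not code from the
patch series: struct xprt_set, pick_roundrobin(), pick_hash() and
fh_hash() are hypothetical names, the FNV-1a hash is a stand-in (the
thread does not show which hash the patches use), and the modulo mapping
from hash to connection is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the set of nconnect connections of one mount.
 * This is NOT the kernel's struct rpc_xprt_switch. */
struct xprt_set {
	size_t nxprts;	/* number of connections (the nconnect value) */
	size_t cursor;	/* next index handed out by round-robin */
};

/* "roundrobin": every RPC goes to the next connection, whatever file it
 * targets, so requests for one file are spread over all connections. */
static size_t pick_roundrobin(struct xprt_set *s)
{
	assert(s->nxprts > 0);
	size_t idx = s->cursor;
	s->cursor = (s->cursor + 1) % s->nxprts;
	return idx;
}

/* "hash": the connection is a pure function of the filehandle hash, so
 * all RPCs for one file use one connection (and hence one source port).
 * The modulo mapping is an assumption made for this sketch. */
static size_t pick_hash(const struct xprt_set *s, uint32_t fhhash)
{
	assert(s->nxprts > 0);
	return fhhash % s->nxprts;
}

/* Toy 32-bit FNV-1a over the filehandle bytes, standing in for whatever
 * a p_fhhash implementation would actually compute. */
static uint32_t fh_hash(const unsigned char *fh, size_t len)
{
	uint32_t h = 2166136261u;
	for (size_t i = 0; i < len; i++) {
		h ^= fh[i];
		h *= 16777619u;
	}
	return h;
}
```

With nxprts = 4, four consecutive writes to one file land on connections
0, 1, 2, 3 under pick_roundrobin(), whereas pick_hash() returns the same
index for every one of them. Behind an L4 load balancer that keys on the
source port, that is what keeps a file on a single cluster node.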