On Thu, 2021-03-18 at 13:57 +0000, Chuck Lever III wrote:
> 
> 
> > On Mar 17, 2021, at 9:56 PM, Nagendra Tomar <
> > Nagendra.Tomar@xxxxxxxxxxxxx> wrote:
> > 
> > We have a clustered NFS server behind an L4 load-balancer with the
> > following characteristics (relevant to this discussion):
> > 
> > 1. RPC requests for the same file issued to different cluster nodes
> >    are not efficient. One file to one cluster node is efficient.
> >    This is particularly true for WRITEs.
> > 2. Multiple nconnect xprts land on different cluster nodes, because
> >    the source port is different for each of them.
> > 
> > Because of this, the default nconnect round-robin policy does not
> > work very well, as it results in RPCs targeted at the same file
> > being serviced by different cluster nodes.
> > 
> > To solve this, we tweaked the NFS multipath code to always choose
> > the same xprt for the same file. We do that by adding a new integer
> > field to rpc_message, rpc_xprt_hint, which is set by the NFS layer
> > and used by the RPC layer to pick an xprt. The NFS layer sets it to
> > the hash of the target file's filehandle, thus ensuring that
> > requests for the same file always use the same xprt. This works
> > well.
> > 
> > I am interested in knowing your thoughts on this: has anyone else
> > come across a similar issue, is there any other way of solving
> > this, etc.?
> 
> Would a pNFS file layout work? The MDS could direct I/O for
> a particular file to a specific DS.

That's the other option if your customers are using NFSv4.1 or
NFSv4.2. It has the advantage that it would also allow the server to
dynamically load balance the I/O across the available cluster nodes by
recalling some layouts from nodes that are too hot and migrating them
to nodes that have spare capacity.

The file metadata and the directory data+metadata will, however, still
be retrieved from the node that the NFS client is mounting from. I
don't know if that might still be a problem for this cluster setup?

-- 
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@xxxxxxxxxxxxxxx
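
[Editor's sketch] To make the scheme Nagendra describes easier to picture, here is a
minimal, self-contained user-space sketch of the idea: hash the filehandle bytes and
reduce the hash modulo the number of nconnect transports, so that every RPC for a
given file is steered onto the same connection (and therefore onto the same cluster
node behind the L4 load balancer). This is not the actual kernel rpc_xprt_hint patch;
the hash function (FNV-1a), the struct and helper names (demo_fh, fh_hash, pick_xprt),
and the example filehandle strings are illustrative assumptions.

/*
 * Sketch of "one file -> one xprt": hash the filehandle, then map the
 * hash onto one of the nconnect transports. Assumed names throughout;
 * not the real NFS/SUNRPC implementation.
 */
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define NFS_MAXFHSIZE 128

struct demo_fh {                        /* shaped like struct nfs_fh */
	unsigned short size;
	unsigned char data[NFS_MAXFHSIZE];
};

/* FNV-1a over the filehandle bytes; any stable hash would do. */
static uint32_t fh_hash(const struct demo_fh *fh)
{
	uint32_t h = 2166136261u;

	for (unsigned short i = 0; i < fh->size; i++) {
		h ^= fh->data[i];
		h *= 16777619u;
	}
	return h;
}

/* Map the per-file hint onto one of the nconnect transports. */
static unsigned int pick_xprt(uint32_t xprt_hint, unsigned int nr_xprts)
{
	return xprt_hint % nr_xprts;
}

int main(void)
{
	const char *files[] = { "fh-of-file-A", "fh-of-file-B", "fh-of-file-C" };
	unsigned int nconnect = 4;

	/* Two "rounds" of RPCs: each file gets the same xprt both times. */
	for (int round = 0; round < 2; round++) {
		for (size_t i = 0; i < 3; i++) {
			struct demo_fh fh = { .size = (unsigned short)strlen(files[i]) };
			uint32_t hint;

			memcpy(fh.data, files[i], fh.size);
			hint = fh_hash(&fh);
			printf("round %d: %-12s -> xprt %u\n",
			       round, files[i], pick_xprt(hint, nconnect));
		}
	}
	return 0;
}

The particular hash does not matter; the only property the approach relies on is that
the filehandle-to-transport mapping is deterministic, so all I/O for a file rides one
TCP connection and hence reaches one node behind the load balancer.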