Re: [RFC] nconnect xprt stickiness for a file

On Thu, 2021-03-18 at 13:57 +0000, Chuck Lever III wrote:
> 
> 
> > On Mar 17, 2021, at 9:56 PM, Nagendra Tomar <Nagendra.Tomar@xxxxxxxxxxxxx> wrote:
> > 
> > We have a clustered NFS server behind an L4 load-balancer with the
> > following characteristics (relevant to this discussion):
> > 
> > 1. RPC requests for the same file issued to different cluster nodes
> >    are not efficient. One file, one cluster node is efficient. This
> >    is particularly true for WRITEs.
> > 2. Multiple nconnect xprts land on different cluster nodes due to
> >    the source port being different for all.
> > 
> > Due to this, the default nconnect roundrobin policy does not work
> > very well, as it results in RPCs targeted to the same file being
> > serviced by different cluster nodes.
> > 
> > To solve this, we tweaked the nfs multipath code to always choose
> > the same xprt for the same file. We do that by adding a new integer
> > field to rpc_message, rpc_xprt_hint, which is set by the NFS layer
> > and used by the RPC layer to pick a xprt. The NFS layer sets it to
> > the hash of the target file's filehandle, thus ensuring same-file
> > requests always use the same xprt. This works well.
> > 
> > I am interested in knowing your thoughts on this: has anyone else
> > come across a similar issue, is there any other way of solving
> > this, etc.
> 
> Would a pNFS file layout work? The MDS could direct I/O for
> a particular file to a specific DS.

That's the other option if your customers are using NFSv4.1 or NFSv4.2.

That has the advantage that it also would allow the server to
dynamically load balance the I/O across the available cluster nodes by
recalling some layouts for nodes that are too hot and migrating them to
nodes that have spare capacity.

The file metadata and directory data+metadata will however still be
retrieved from the node that the NFS client is mounting from. I don't
know if that might still be a problem for this cluster setup?
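
For reference, the xprt stickiness idea quoted above can be sketched in a
few lines. This is only an illustration, not the proposed patch: the
function name pick_xprt, the use of CRC32 as the hash, and the modulo
mapping are all assumptions, since the original mail only says that
rpc_xprt_hint is set to a hash of the filehandle and used by the RPC
layer to pick a xprt.

```python
import zlib


def pick_xprt(filehandle: bytes, nr_xprts: int) -> int:
    """Return the index of the transport to use for this filehandle.

    Hypothetical helper: the same filehandle always maps to the same
    index, so all RPCs for one file travel over one connection.
    """
    if nr_xprts <= 0:
        raise ValueError("need at least one transport")
    # Stand-in for rpc_xprt_hint; the real hash function is not
    # specified in the thread, CRC32 is assumed here.
    hint = zlib.crc32(filehandle)
    return hint % nr_xprts


if __name__ == "__main__":
    # Made-up filehandles, 4 nconnect transports.
    for fh in (b"fh-file-a", b"fh-file-b", b"fh-file-a"):
        print(fh, "->", pick_xprt(fh, 4))
```

Unlike round robin, the choice depends only on the filehandle and the
number of transports, so it stays stable across requests as long as the
set of transports does not change.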

-- 
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@xxxxxxxxxxxxxxx





