> > On Mar 23, 2021, at 12:29 PM, Nagendra Tomar <Nagendra.Tomar@xxxxxxxxxxxxx> wrote:
> >
> >>> On Mar 23, 2021, at 11:57 AM, Nagendra Tomar <Nagendra.Tomar@xxxxxxxxxxxxx> wrote:
> >>>
> >>>>> On Mar 23, 2021, at 1:46 AM, Nagendra Tomar <Nagendra.Tomar@xxxxxxxxxxxxx> wrote:
> >>>>>
> >>>>> From: Nagendra S Tomar <natomar@xxxxxxxxxxxxx>
> >>>>
> >>>> The flexfiles layout can handle an NFSv4.1 client and NFSv3 data
> >>>> servers. In fact it was designed for exactly this kind of mix of
> >>>> NFS versions.
> >>>>
> >>>> No client code change will be necessary -- there are a lot more
> >>>> clients than servers. The MDS can be made to work smartly in
> >>>> concert with the load balancer, over time; or it can adopt other
> >>>> clever strategies.
> >>>>
> >>>> IMHO pNFS is the better long-term strategy here.
> >>>
> >>> The fundamental difference here is that the clustered NFSv3 server
> >>> is available over a single virtual IP, so IIUC even if we were to
> >>> use NFSv4.1 with the flexfiles layout, all it can hand over to the
> >>> client is that single (load-balanced) virtual IP, and when the
> >>> clients then connect to the NFSv3 DS we still have the same issue.
> >>> Am I understanding you right? Can you please elaborate on what you
> >>> mean by "MDS can be made to work smartly in concert with the load
> >>> balancer"?
> >>
> >> I had thought there were multiple NFSv3 server targets in play.
> >>
> >> If the load balancer is making them look like a single IP address,
> >> then take it out of the equation: expose all the NFSv3 servers to
> >> the clients and let the MDS direct operations to each data server.
> >>
> >> AIUI this is the approach (without the use of NFSv3) taken by
> >> NetApp next generation clusters.
> >
> > Yeah, if we could have clients access all the NFSv3 servers, then I
> > agree pNFS would be a viable option. Unfortunately that's not an
> > option in this case. The cluster has hundreds of nodes and it's not
> > an on-prem server but a cloud service, so the simplicity of the
> > single LB VIP is critical.
>
> The clients mount only the MDS. The MDS provides the DS addresses;
> they are not exposed to client administrators. If the MDS adopts the
> load balancer's IP address, then the clients would simply mount that
> same server address using NFSv4.1.

I understand and agree with the "client mounts the single MDS IP" part.
What I meant by "simplicity of the single LB VIP" is not having to have
so many routable IP addresses, since the clients could be on a (very)
different network than the storage cluster they are accessing, even
though client admins would not deal with those addresses themselves, as
you mention.

> The other alternative is to make the load balancer sniff the FH from
> each NFS request and direct it to a consistent NFSv3 DS. I still
> prefer that over adding a very special-case mount option to the Linux
> client. Again, you'd be deploying a code change in one place, under
> your control, instead of on hundreds of clients.

That is one option, but it makes the LB application-aware and
potentially less performant. Appreciate your suggestion, though! I was
hoping that such a client-side change could be useful to more users
with similar setups; after all, file->connection affinity doesn't sound
too arcane, and one can think of benefits to having one node process
one file. No?

> --
> Chuck Lever
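
To make the file->connection affinity idea concrete, here is a minimal
userspace C sketch, not the actual kernel patch: it hashes the opaque
NFS file handle bytes (FNV-1a is used purely for illustration) and uses
the hash to pick one of the client's connections, so every READ, WRITE,
and COMMIT for a given file goes out on the same connection. The
function names and the 8-connection pool in main() are made up for the
example.

/*
 * Illustrative sketch only: map an NFS file handle to a connection
 * index so all I/O to the same file reuses the same connection.
 */
#include <stdint.h>
#include <stdio.h>
#include <stddef.h>

/* FNV-1a over the opaque file handle bytes (hash choice is arbitrary). */
static uint32_t fh_hash(const unsigned char *fh, size_t fh_len)
{
	uint32_t h = 2166136261u;

	for (size_t i = 0; i < fh_len; i++) {
		h ^= fh[i];
		h *= 16777619u;
	}
	return h;
}

/*
 * Map a file handle to one of 'nconn' connections. Because the mapping
 * depends only on the FH, all traffic for that file stays on one
 * connection, so a load balancer that hashes the TCP 5-tuple keeps
 * steering that file to the same cluster node.
 */
static unsigned int fh_to_conn(const unsigned char *fh, size_t fh_len,
			       unsigned int nconn)
{
	return fh_hash(fh, fh_len) % nconn;
}

int main(void)
{
	/* Two made-up file handles and a pool of 8 connections. */
	const unsigned char fh1[] = { 0x01, 0x02, 0x03, 0x04, 0xaa, 0xbb };
	const unsigned char fh2[] = { 0x01, 0x02, 0x03, 0x05, 0xcc, 0xdd };

	printf("fh1 -> connection %u\n", fh_to_conn(fh1, sizeof(fh1), 8));
	printf("fh2 -> connection %u\n", fh_to_conn(fh2, sizeof(fh2), 8));
	return 0;
}

The load-balancer alternative described above is essentially the same
mapping computed one hop later: the LB would parse the FH out of each
NFSv3 request and hash it to a consistent DS, at the cost of making the
balancer NFS-aware.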