Re: [for-6.13 PATCH 10/19] nfs_common: move localio_lock to new lock member of nfs_uuid_t

Mike Snitzer <snitzer@xxxxxxxxxx> · Mon, 11 Nov 2024 19:16:50 -0500

On Tue, Nov 12, 2024 at 10:23:19AM +1100, NeilBrown wrote:
> On Tue, 12 Nov 2024, Mike Snitzer wrote:
> > On Tue, Nov 12, 2024 at 07:35:04AM +1100, NeilBrown wrote:
> > > 
> > > I don't like NFS_CS_LOCAL_IO_CAPABLE.
> > > A use case that I imagine (and a customer does something like this) is an
> > > HA cluster where the NFS server can move from one node to another.  All
> > > the node access the filesystem, most over NFS.  If a server-migration
> > > happens (e.g.  the current server node failed) then the new server node
> > > would suddenly become LOCALIO-capable even though it wasn't at
> > > mount-time.  I would like it to be able to detect this and start doing
> > > localio.
> > 
> > Server migration while retaining the client being local to the new
> > server?  So client migrates too?
> 
> No.  Client doesn't migrate.  Server migrates and appears on the same
> host as the client.  The client can suddenly get better performance.  It
> should benefit from that.
> 
> > 
> > If the client migrates then it will negotiate with server using
> > LOCALIO protocol.
> > 
> > Anyway, this HA hypothetical feels contrived.  It is fine that you
> > dislike NFS_CS_LOCAL_IO_CAPABLE but I don't understand what you'd like
> > as an alternative.  Or why the simplicity in my approach lacking.
> 
> We have customers with exactly this HA config.  This is why I put work
> into make sure loop-back NFS (client and server on same node) works
> cleanly without memory allocation deadlocks.
>   https://lwn.net/Articles/595652/
> Getting localio in that config would be even better.
> 
> Your approach assumes that if LOCALIO isn't detected at mount time, it
> will never been available.  I think that is a flawed assumption.

That's fair, I agree your HA scenario is valid.  It was terse as
initially presented but I understand now, thanks.

> > > So I don't want NFS_CS_LOCAL_IO_CAPABLE.  I think tracking when the
> > > network connection is re-established is sufficient.
> > 
> > Eh, that type of tracking doesn't really buy me anything if I've lost
> > context (that LOCALIO was previously established and should be
> > re-established).
> > 
> > NFS v3 is stateless, hence my hooking off read and write paths to
> > trigger nfs_local_probe_async().  Unlike NFS v4, with its grace, more
> > care is needed to avoid needless calls to nfs_local_probe_async().
> 
> I think it makes perfect sense to trigger the probe on a successful
> read/write with some rate limiting to avoid sending a LOCALIO probe on
> EVERY read/write.  Your rate-limiting for NFSv3 is:
>    - never probe if the mount-time probe was not successful
>    - otherwise probe once every 256 IO requests.
> 
> I think the first is too restrictive, and the second is unnecessarily
> frequent.
> I propose:
>    - probe once each time the client reconnects with the server
> 
> This will result in many fewer probes in practice, but any successful
> probe will happen at nearly the earliest possible moment.

I'm all for what you're proposing (its what I wanted from the start).
In practice I just don't quite grok the client reconnect awareness
implementation you're saying is at our finger tips.

> > Your previous email about just tracking network connection change was
> > an optimization for avoiding repeat (pointless) probes.  We still
> > need to know to do the probe to begin with.  Are you saying you want
> > to backfill the equivalent of grace (or pseudo-grace) to NFSv3?
> 
> You don't "know to do the probe" at mount time.  You simply always do
> it.  Similarly whenever localio isn't active it is always appropriate to
> probe - with rate limiting.
> 
> And NFSv3 already has a grace period - in the NLM/STAT protocols.  We
> could use STAT to detect when the server has restarted and so it is worth
> probing again.  But STAT is not as reliable as we might like and it
> would be more complexity with no real gain.

If you have a specific idea for the mechanism we need to create to
detect the v3 client reconnects to the server please let me know.
Reusing or augmenting an existing thing is fine by me.

> I would be happy to use exactly the same mechanism for both v3 and v4:
> send a probe after IO on a new connection.  But your solution for v4 is
> simple and elegant so I'm not at all against it.
> 
> > 
> > My approach works.  Not following what you are saying will be better.
> 
>  - server-migration can benefit from localio on the new host
>  - many fewer probes
>  - probes are much more timely.

Ha, you misunderstood me: I know the benefits.. what eludes me is the
construction of the point to probe (reliable v3 client reconnect
awareness). ;)

Mike