Re: [for-6.13 PATCH 10/19] nfs_common: move localio_lock to new lock member of nfs_uuid_t

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Nov 12, 2024 at 07:35:04AM +1100, NeilBrown wrote:
> On Tue, 12 Nov 2024, Mike Snitzer wrote:
> > On Mon, Nov 11, 2024 at 12:55:24PM +1100, NeilBrown wrote:
> > > On Sat, 09 Nov 2024, Mike Snitzer wrote:
> > > > Remove cl_localio_lock from 'struct nfs_client' in favor of adding a
> > > > lock to the nfs_uuid_t struct (which is embedded in each nfs_client).
> > > > 
> > > > Push nfs_local_{enable,disable} implementation down to nfs_common.
> > > > Those methods now call nfs_localio_{enable,disable}_client.
> > > > 
> > > > This allows implementing nfs_localio_invalidate_clients in terms of
> > > > nfs_localio_disable_client.
> > > > 
> > > > Signed-off-by: Mike Snitzer <snitzer@xxxxxxxxxx>
> > > > ---
> > > >  fs/nfs/client.c            |  1 -
> > > >  fs/nfs/localio.c           | 18 ++++++------
> > > >  fs/nfs_common/nfslocalio.c | 57 ++++++++++++++++++++++++++------------
> > > >  include/linux/nfs_fs_sb.h  |  1 -
> > > >  include/linux/nfslocalio.h |  8 +++++-
> > > >  5 files changed, 55 insertions(+), 30 deletions(-)
> > > > 
> > > > diff --git a/fs/nfs/client.c b/fs/nfs/client.c
> > > > index 03ecc7765615..124232054807 100644
> > > > --- a/fs/nfs/client.c
> > > > +++ b/fs/nfs/client.c
> > > > @@ -182,7 +182,6 @@ struct nfs_client *nfs_alloc_client(const struct nfs_client_initdata *cl_init)
> > > >  	seqlock_init(&clp->cl_boot_lock);
> > > >  	ktime_get_real_ts64(&clp->cl_nfssvc_boot);
> > > >  	nfs_uuid_init(&clp->cl_uuid);
> > > > -	spin_lock_init(&clp->cl_localio_lock);
> > > >  #endif /* CONFIG_NFS_LOCALIO */
> > > >  
> > > >  	clp->cl_principal = "*";
> > > > diff --git a/fs/nfs/localio.c b/fs/nfs/localio.c
> > > > index cab2a8819259..4c75ffc5efa2 100644
> > > > --- a/fs/nfs/localio.c
> > > > +++ b/fs/nfs/localio.c
> > > > @@ -125,10 +125,8 @@ const struct rpc_program nfslocalio_program = {
> > > >   */
> > > >  static void nfs_local_enable(struct nfs_client *clp)
> > > >  {
> > > > -	spin_lock(&clp->cl_localio_lock);
> > > > -	set_bit(NFS_CS_LOCAL_IO, &clp->cl_flags);
> > > >  	trace_nfs_local_enable(clp);
> > > > -	spin_unlock(&clp->cl_localio_lock);
> > > > +	nfs_localio_enable_client(clp);
> > > >  }
> > > 
> > > Why do we need this function?  The one caller could call
> > > nfs_localio_enable_client() directly instead.  The tracepoint could be
> > > placed in that one caller.
> > 
> > Yeah, I saw that too but felt it useful to differentiate between calls
> > that occur during NFS client initialization and those that happen as a
> > side-effect of callers from other contexts (in later patch this
> > manifests as nfs_localio_disable_client vs nfs_local_disable).
> > 
> > Hence my adding secondary tracepoints for nfs_common (see "[PATCH
> > 17/19] nfs_common: add nfs_localio trace events).
> > 
> > But sure, we can just eliminate nfs_local_{enable,disable} and the
> > corresponding tracepoints (which will have moved down to nfs_common).
> 
> I don't feel strongly about this.  If you think these is value in these
> wrapper functions then I won't argue.  As a general rule I don't like
> multiple interfaces that do (much) the same thing as keeping track of
> them increases the mental load.
> 
> > 
> > > >  /*
> > > > @@ -136,12 +134,8 @@ static void nfs_local_enable(struct nfs_client *clp)
> > > >   */
> > > >  void nfs_local_disable(struct nfs_client *clp)
> > > >  {
> > > > -	spin_lock(&clp->cl_localio_lock);
> > > > -	if (test_and_clear_bit(NFS_CS_LOCAL_IO, &clp->cl_flags)) {
> > > > -		trace_nfs_local_disable(clp);
> > > > -		nfs_localio_disable_client(&clp->cl_uuid);
> > > > -	}
> > > > -	spin_unlock(&clp->cl_localio_lock);
> > > > +	trace_nfs_local_disable(clp);
> > > > +	nfs_localio_disable_client(clp);
> > > >  }
> > > 
> > > Ditto.  Though there are more callers so the tracepoint solution isn't
> > > quite so obvious.
> > 
> > Right... as I just explained: that's why I preserved nfs_local_disable
> > (and the tracepoint).
> > 
> > 
> > > >  /*
> > > > @@ -183,8 +177,12 @@ static bool nfs_server_uuid_is_local(struct nfs_client *clp)
> > > >  	rpc_shutdown_client(rpcclient_localio);
> > > >  
> > > >  	/* Server is only local if it initialized required struct members */
> > > > -	if (status || !clp->cl_uuid.net || !clp->cl_uuid.dom)
> > > > +	rcu_read_lock();
> > > > +	if (status || !rcu_access_pointer(clp->cl_uuid.net) || !clp->cl_uuid.dom) {
> > > > +		rcu_read_unlock();
> > > >  		return false;
> > > > +	}
> > > > +	rcu_read_unlock();
> > > 
> > > What value does RCU provide here?  I don't think this change is needed.
> > > rcu_access_pointer does not require rcu_read_lock().
> > 
> > OK, not sure why I though RCU read-side needed for rcu_access_pointer()...
> > 
> > > >  	return true;
> > > >  }
> > > > diff --git a/fs/nfs_common/nfslocalio.c b/fs/nfs_common/nfslocalio.c
> > > > index 904439e4bb85..cf2f47ea4f8d 100644
> > > > --- a/fs/nfs_common/nfslocalio.c
> > > > +++ b/fs/nfs_common/nfslocalio.c
> > > > @@ -7,6 +7,9 @@
> > > >  #include <linux/module.h>
> > > >  #include <linux/list.h>
> > > >  #include <linux/nfslocalio.h>
> > > > +#include <linux/nfs3.h>
> > > > +#include <linux/nfs4.h>
> > > > +#include <linux/nfs_fs_sb.h>
> > > 
> > > I don't feel good about adding this nfs client knowledge in to nfs_common.
> > 
> > I hear you.. I was "OK with it".
> > 
> > > >  #include <net/netns/generic.h>
> > > >  
> > > >  MODULE_LICENSE("GPL");
> > > > @@ -25,6 +28,7 @@ void nfs_uuid_init(nfs_uuid_t *nfs_uuid)
> > > >  	nfs_uuid->net = NULL;
> > > >  	nfs_uuid->dom = NULL;
> > > >  	INIT_LIST_HEAD(&nfs_uuid->list);
> > > > +	spin_lock_init(&nfs_uuid->lock);
> > > >  }
> > > >  EXPORT_SYMBOL_GPL(nfs_uuid_init);
> > > >  
> > > > @@ -94,12 +98,23 @@ void nfs_uuid_is_local(const uuid_t *uuid, struct list_head *list,
> > > >  }
> > > >  EXPORT_SYMBOL_GPL(nfs_uuid_is_local);
> > > >  
> > > > +void nfs_localio_enable_client(struct nfs_client *clp)
> > > > +{
> > > > +	nfs_uuid_t *nfs_uuid = &clp->cl_uuid;
> > > > +
> > > > +	spin_lock(&nfs_uuid->lock);
> > > > +	set_bit(NFS_CS_LOCAL_IO, &clp->cl_flags);
> > > > +	spin_unlock(&nfs_uuid->lock);
> > > > +}
> > > > +EXPORT_SYMBOL_GPL(nfs_localio_enable_client);
> > > 
> > > And I don't feel good about nfs_local accessing nfs_client directly.
> > > It only uses it for NFS_CS_LOCAL_IO.  Can we ditch that flag and instead
> > > so something like testing nfs_uuid.net ??
> > 
> > That'd probably be OK for the equivalent of NFS_CS_LOCAL_IO but the last
> > patch in this series ("nfs: probe for LOCALIO when v3 client
> > reconnects to server") adds NFS_CS_LOCAL_IO_CAPABLE to provide a hint
> > that the client and server successfully established themselves local
> > via LOCALIO protocol.  This is needed so that NFSv3 (stateless) has a
> > hint that reestablishing LOCALIO needed if/when the client loses
> > connectivity to the server (because it was shutdown and restarted).
> 
> I don't like NFS_CS_LOCAL_IO_CAPABLE.
> A use case that I imagine (and a customer does something like this) is an
> HA cluster where the NFS server can move from one node to another.  All
> the node access the filesystem, most over NFS.  If a server-migration
> happens (e.g.  the current server node failed) then the new server node
> would suddenly become LOCALIO-capable even though it wasn't at
> mount-time.  I would like it to be able to detect this and start doing
> localio.

Server migration while retaining the client being local to the new
server?  So client migrates too?

If the client migrates then it will negotiate with server using
LOCALIO protocol.

Anyway, this HA hypothetical feels contrived.  It is fine that you
dislike NFS_CS_LOCAL_IO_CAPABLE but I don't understand what you'd like
as an alternative.  Or why the simplicity in my approach lacking.

> So I don't want NFS_CS_LOCAL_IO_CAPABLE.  I think tracking when the
> network connection is re-established is sufficient.

Eh, that type of tracking doesn't really buy me anything if I've lost
context (that LOCALIO was previously established and should be
re-established).

NFS v3 is stateless, hence my hooking off read and write paths to
trigger nfs_local_probe_async().  Unlike NFS v4, with its grace, more
care is needed to avoid needless calls to nfs_local_probe_async().

Your previous email about just tracking network connection change was
an optimization for avoiding repeat (pointless) probes.  We still
need to know to do the probe to begin with.  Are you saying you want
to backfill the equivalent of grace (or pseudo-grace) to NFSv3?

My approach works.  Not following what you are saying will be better.

Thanks,
Mike




[Index of Archives]     [Linux Filesystem Development]     [Linux USB Development]     [Linux Media Development]     [Video for Linux]     [Linux NILFS]     [Linux Audio Users]     [Yosemite Info]     [Linux SCSI]

  Powered by Linux