Re: [PATCH v15 16/26] nfsd: add LOCALIO support

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, Sep 06, 2024 at 11:04:16AM -0400, Mike Snitzer wrote:
> On Fri, Sep 06, 2024 at 09:34:08AM +1000, NeilBrown wrote:
> > On Fri, 06 Sep 2024, Mike Snitzer wrote:
> > > On Wed, Sep 04, 2024 at 09:47:07AM -0400, Chuck Lever wrote:
> > > > On Wed, Sep 04, 2024 at 03:01:46PM +1000, NeilBrown wrote:
> > > > > On Wed, 04 Sep 2024, NeilBrown wrote:
> > > > > > 
> > > > > > I agree that dropping and reclaiming a lock is an anti-pattern and in
> > > > > > best avoided in general.  I cannot see a better alternative in this
> > > > > > case.
> > > > > 
> > > > > It occurred to me what I should spell out the alternate that I DO see so
> > > > > you have the option of disagreeing with my assessment that it isn't
> > > > > "better".
> > > > > 
> > > > > We need RCU to call into nfsd, we need a per-cpu ref on the net (which
> > > > > we can only get inside nfsd) and NOT RCU to call
> > > > > nfsd_file_acquire_local().
> > > > > 
> > > > > The current code combines these (because they are only used together)
> > > > > and so the need to drop rcu. 
> > > > > 
> > > > > I thought briefly that it could simply drop rcu and leave it dropped
> > > > > (__releases(rcu)) but not only do I generally like that LESS than
> > > > > dropping and reclaiming, I think it would be buggy.  While in the nfsd
> > > > > module code we need to be holding either rcu or a ref on the server else
> > > > > the code could disappear out from under the CPU.  So if we exit without
> > > > > a ref on the server - which we do if nfsd_file_acquire_local() fails -
> > > > > then we need to reclaim RCU *before* dropping the ref.  So the current
> > > > > code is slightly buggy.
> > > > > 
> > > > > We could instead split the combined call into multiple nfs_to
> > > > > interfaces.
> > > > > 
> > > > > So nfs_open_local_fh() in nfs_common/nfslocalio.c would be something
> > > > > like:
> > > > > 
> > > > >  rcu_read_lock();
> > > > >  net = READ_ONCE(uuid->net);
> > > > >  if (!net || !nfs_to.get_net(net)) {
> > > > >        rcu_read_unlock();
> > > > >        return ERR_PTR(-ENXIO);
> > > > >  }
> > > > >  rcu_read_unlock();
> > > > >  localio = nfs_to.nfsd_open_local_fh(....);
> > > > >  if (IS_ERR(localio))
> > > > >        nfs_to.put_net(net);
> > > > >  return localio;
> > > > > 
> > > > > So we have 3 interfaces instead of 1, but no hidden unlock/lock.
> > > > 
> > > > Splitting up the function call occurred to me as well, but I didn't
> > > > come up with a specific bit of surgery. Thanks for the suggestion.
> > > > 
> > > > At this point, my concern is that we will lose your cogent
> > > > explanation of why the release/lock is done. Having it in email is
> > > > great, but email is more ephemeral than actually putting it in the
> > > > code.
> > > > 
> > > > 
> > > > > As I said, I don't think this is a net win, but reasonable people might
> > > > > disagree with me.
> > > > 
> > > > The "win" here is that it makes this code self-documenting and
> > > > somewhat less likely to be broken down the road by changes in and
> > > > around this area. Since I'm more forgetful these days I lean towards
> > > > the more obvious kinds of coding solutions. ;-)
> > > > 
> > > > Mike, how do you feel about the 3-interface suggestion?
> > > 
> > > I dislike expanding from 1 indirect function call to 2 in rapid
> > > succession (3 for the error path, not a problem, just being precise.
> > > But I otherwise like it.. maybe.. heh.
> > > 
> > > FYI, I did run with the suggestion to make nfs_to a pointer that just
> > > needs a simple assignment rather than memcpy to initialize.  So Neil's
> > > above code becames:
> > > 
> > >         rcu_read_lock();
> > >         net = rcu_dereference(uuid->net);
> > >         if (!net || !nfs_to->nfsd_serv_try_get(net)) {
> > >                 rcu_read_unlock();
> > >                 return ERR_PTR(-ENXIO);
> > >         }
> > >         rcu_read_unlock();
> > >         /* We have an implied reference to net thanks to nfsd_serv_try_get */
> > >         localio = nfs_to->nfsd_open_local_fh(net, uuid->dom, rpc_clnt,
> > >                                              cred, nfs_fh, fmode);
> > >         if (IS_ERR(localio))
> > >                 nfs_to->nfsd_serv_put(net);
> > >         return localio;
> > > 
> > > I do think it cleans the code up... full patch is here:
> > > https://git.kernel.org/pub/scm/linux/kernel/git/snitzer/linux.git/commit/?h=nfs-localio-for-next.v15-with-fixups&id=e85306941878a87070176702de687f2779436061
> > > 
> > > But I'm still on the fence.. someone help push me over!
> > 
> > I think the new code is unquestionable clearer, and not taking this
> > approach would be a micro-optimisation which would need to be
> > numerically justified.  So I'm pushing for the three-interface version
> > (despite what I said before).
> > 
> > Unfortunately the new code is not bug-free - not quite.
> > As soon as nfs_to->nfsd_serv_put() calls percpu_ref_put() the nfsd
> > module can be unloaded, and the "return" instruction might not be
> > present.  For this to go wrong would require a lot of bad luck, but if
> > the CPU took an interrupt at the wrong time were would be room.
> > 
> > [Ever since module_put_and_exit() was added (now ..and_kthread_exit)
> >  I've been sensitive to dropping the ref to a module in code running in
> >  the module]
> > 
> > So I think nfsd_serv_put (and nfsd_serv_try_get() __must_hold(RCU) and
> > nfs_open_local_fh() needs rcu_read_lock() before calling
> > nfs_to->nfsd_serv_put(net).
> 
> OK, yes I can see that, I implemented what you suggested at the end of
> your reply (see inline patch below)...
> 
> But I'd just like to point out that something like the below patch
> wouldn't be needed if we kept my "heavy" approach (nfs reference on
> nfsd modules via nfs_common using request_symbol):
> https://marc.info/?l=linux-nfs&m=172499445027800&w=2
> (that patch has stuff I since cleaned up, e.g. removed typedefs and
> EXPORT_SYMBOL_GPLs..)
> 
> I knew we were going to pay for being too cute with how nfs took its
> reference on nfsd.
> 
> So here we are, needing fiddly incremental fixes like this to close a
> really-small-yet-will-be-deadly race:

<snip required delicate rcu re-locking requirements patch>

I prefer this incremental re-implementation of my symbol_request patch
that eliminates all concerns about the validity of 'nfs_to' calls:

---
 fs/nfs/localio.c           |  5 +++
 fs/nfs_common/nfslocalio.c | 84 +++++++++++++++++++++++++++++++-------
 fs/nfsd/localio.c          |  2 +-
 include/linux/nfslocalio.h |  7 +++-
 4 files changed, 80 insertions(+), 18 deletions(-)

diff --git a/fs/nfs/localio.c b/fs/nfs/localio.c
index c29cdf51c458..43520ac0fde8 100644
--- a/fs/nfs/localio.c
+++ b/fs/nfs/localio.c
@@ -124,6 +124,10 @@ const struct rpc_program nfslocalio_program = {
 static void nfs_local_enable(struct nfs_client *clp)
 {
 	spin_lock(&clp->cl_localio_lock);
+	if (!nfs_to_nfsd_localio_ops_get()) {
+		spin_unlock(&clp->cl_localio_lock);
+		return;
+	}
 	set_bit(NFS_CS_LOCAL_IO, &clp->cl_flags);
 	trace_nfs_local_enable(clp);
 	spin_unlock(&clp->cl_localio_lock);
@@ -138,6 +142,7 @@ void nfs_local_disable(struct nfs_client *clp)
 	if (test_and_clear_bit(NFS_CS_LOCAL_IO, &clp->cl_flags)) {
 		trace_nfs_local_disable(clp);
 		nfs_uuid_invalidate_one_client(&clp->cl_uuid);
+		nfs_to_nfsd_localio_ops_put();
 	}
 	spin_unlock(&clp->cl_localio_lock);
 }
diff --git a/fs/nfs_common/nfslocalio.c b/fs/nfs_common/nfslocalio.c
index 42b479b9191f..9039e0f1afa3 100644
--- a/fs/nfs_common/nfslocalio.c
+++ b/fs/nfs_common/nfslocalio.c
@@ -7,6 +7,7 @@
 #include <linux/module.h>
 #include <linux/rculist.h>
 #include <linux/nfslocalio.h>
+#include <linux/refcount.h>
 #include <net/netns/generic.h>
 
 MODULE_LICENSE("GPL");
@@ -53,11 +54,8 @@ static nfs_uuid_t * nfs_uuid_lookup_locked(const uuid_t *uuid)
 	return NULL;
 }
 
-static struct module *nfsd_mod;
-
 void nfs_uuid_is_local(const uuid_t *uuid, struct list_head *list,
-		       struct net *net, struct auth_domain *dom,
-		       struct module *mod)
+		       struct net *net, struct auth_domain *dom)
 {
 	nfs_uuid_t *nfs_uuid;
 
@@ -73,9 +71,6 @@ void nfs_uuid_is_local(const uuid_t *uuid, struct list_head *list,
 		 */
 		list_move(&nfs_uuid->list, list);
 		rcu_assign_pointer(nfs_uuid->net, net);
-
-		__module_get(mod);
-		nfsd_mod = mod;
 	}
 	spin_unlock(&nfs_uuid_lock);
 }
@@ -83,10 +78,8 @@ EXPORT_SYMBOL_GPL(nfs_uuid_is_local);
 
 static void nfs_uuid_put_locked(nfs_uuid_t *nfs_uuid)
 {
-	if (nfs_uuid->net) {
-		module_put(nfsd_mod);
-		nfs_uuid->net = NULL;
-	}
+	if (nfs_uuid->net)
+		RCU_INIT_POINTER(nfs_uuid->net, NULL);
 	if (nfs_uuid->dom) {
 		auth_domain_put(nfs_uuid->dom);
 		nfs_uuid->dom = NULL;
@@ -123,14 +116,14 @@ struct nfsd_file *nfs_open_local_fh(nfs_uuid_t *uuid,
 	struct nfsd_file *localio;
 
 	/*
-	 * Not running in nfsd context, so must safely get reference on nfsd_serv.
+	 * NFS has a reference to NFSD and can safely make 'nfs_to' calls.
+	 *
+	 * But not running in NFSD context, so must safely get reference to nfsd_serv.
 	 * But the server may already be shutting down, if so disallow new localio.
+	 *
 	 * uuid->net is NOT a counted reference, but rcu_read_lock() ensures that
 	 * if uuid->net is not NULL, then calling nfsd_serv_try_get() is safe
 	 * and if it succeeds we will have an implied reference to the net.
-	 *
-	 * Otherwise NFS may not have ref on NFSD and therefore cannot safely
-	 * make 'nfs_to' calls.
 	 */
 	rcu_read_lock();
 	net = rcu_dereference(uuid->net);
@@ -153,6 +146,7 @@ EXPORT_SYMBOL_GPL(nfs_open_local_fh);
  * but cannot be statically linked, because that will make the NFS
  * module always depend on the NFSD module.
  *
+ * [FIXME: must adjust following 2 paragraphs]
  * 'nfs_to' provides NFS access to NFSD functions needed for LOCALIO,
  * its lifetime is tightly coupled to the NFSD module and will always
  * be available to NFS LOCALIO because any successful client<->server
@@ -170,3 +164,63 @@ EXPORT_SYMBOL_GPL(nfs_open_local_fh);
  */
 const struct nfsd_localio_operations *nfs_to;
 EXPORT_SYMBOL_GPL(nfs_to);
+
+static DEFINE_SPINLOCK(nfs_to_nfsd_lock);
+static refcount_t nfs_to_ref;
+
+bool nfs_to_nfsd_localio_ops_get(void)
+{
+	spin_lock(&nfs_to_nfsd_lock);
+
+	/* Only get nfsd_localio_operations on first reference */
+	if (refcount_read(&nfs_to_ref) == 0) {
+		refcount_set(&nfs_to_ref, 1);
+		/* fallthru */
+	} else {
+		refcount_inc(&nfs_to_ref);
+		spin_unlock(&nfs_to_nfsd_lock);
+		return true;
+	}
+
+	/* Must drop spinlock before call to symbol_request */
+	spin_unlock(&nfs_to_nfsd_lock);
+
+	/*
+	 * If NFSD isn't available LOCALIO isn't possible.
+	 * Use nfsd_open_local_fh symbol as the bellwether, if
+	 * available then nfs_common has NFSD module reference
+	 * on NFS's behalf and can safely call 'nfs_to' functions.
+	 */
+	if (!symbol_request(nfsd_open_local_fh))
+		return false;
+	return true;
+}
+EXPORT_SYMBOL_GPL(nfs_to_nfsd_localio_ops_get);
+
+void nfs_to_nfsd_localio_ops_put(void)
+{
+	spin_lock(&nfs_to_nfsd_lock);
+
+	if (!refcount_dec_and_test(&nfs_to_ref))
+		goto out;
+
+	symbol_put(nfsd_open_local_fh);
+	nfs_to = NULL;
+out:
+	spin_unlock(&nfs_to_nfsd_lock);
+}
+EXPORT_SYMBOL_GPL(nfs_to_nfsd_localio_ops_put);
+
+static int __init nfslocalio_init(void)
+{
+	refcount_set(&nfs_to_ref, 0);
+
+	return 0;
+}
+
+static void __exit nfslocalio_exit(void)
+{
+}
+
+module_init(nfslocalio_init);
+module_exit(nfslocalio_exit);
diff --git a/fs/nfsd/localio.c b/fs/nfsd/localio.c
index 291e9c69cae4..291ad916d67a 100644
--- a/fs/nfsd/localio.c
+++ b/fs/nfsd/localio.c
@@ -114,7 +114,7 @@ static __be32 localio_proc_uuid_is_local(struct svc_rqst *rqstp)
 	struct nfsd_net *nn = net_generic(net, nfsd_net_id);
 
 	nfs_uuid_is_local(&argp->uuid, &nn->local_clients,
-			  net, rqstp->rq_client, THIS_MODULE);
+			  net, rqstp->rq_client);
 
 	return rpc_success;
 }
diff --git a/include/linux/nfslocalio.h b/include/linux/nfslocalio.h
index b353abe00357..2e6b9107a7d1 100644
--- a/include/linux/nfslocalio.h
+++ b/include/linux/nfslocalio.h
@@ -35,7 +35,7 @@ typedef struct {
 void nfs_uuid_begin(nfs_uuid_t *);
 void nfs_uuid_end(nfs_uuid_t *);
 void nfs_uuid_is_local(const uuid_t *, struct list_head *,
-		       struct net *, struct auth_domain *, struct module *);
+		       struct net *, struct auth_domain *);
 void nfs_uuid_invalidate_clients(struct list_head *list);
 void nfs_uuid_invalidate_one_client(nfs_uuid_t *nfs_uuid);
 
@@ -58,9 +58,12 @@ struct nfsd_localio_operations {
 	struct file *(*nfsd_file_file)(struct nfsd_file *);
 } ____cacheline_aligned;
 
-extern void nfsd_localio_ops_init(void);
 extern const struct nfsd_localio_operations *nfs_to;
 
+extern void nfsd_localio_ops_init(void);
+bool nfs_to_nfsd_localio_ops_get(void);
+void nfs_to_nfsd_localio_ops_put(void);
+
 struct nfsd_file *nfs_open_local_fh(nfs_uuid_t *,
 		   struct rpc_clnt *, const struct cred *,
 		   const struct nfs_fh *, const fmode_t);
-- 
2.39.3





[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [NTFS 3]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [NTFS 3]     [Samba]     [Device Mapper]     [CEPH Development]

  Powered by Linux