On Tue, 25 Oct 2022, Chuck Lever III wrote:
> 
> > On Oct 24, 2022, at 7:37 PM, NeilBrown <neilb@xxxxxxx> wrote:
> > 
> > On Tue, 25 Oct 2022, Chuck Lever wrote:
> >> Introduce the infrastructure for managing nfs4_file objects in an
> >> rhashtable. This infrastructure will be used by the next patch.
> >> 
> >> Signed-off-by: Chuck Lever <chuck.lever@xxxxxxxxxx>
> >> ---
> >>  fs/nfsd/nfs4state.c | 23 ++++++++++++++++++++++-
> >>  fs/nfsd/state.h     |  1 +
> >>  2 files changed, 23 insertions(+), 1 deletion(-)
> >> 
> >> diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> >> index abed795bb4ec..681cb2daa843 100644
> >> --- a/fs/nfsd/nfs4state.c
> >> +++ b/fs/nfsd/nfs4state.c
> >> @@ -44,7 +44,9 @@
> >>  #include <linux/jhash.h>
> >>  #include <linux/string_helpers.h>
> >>  #include <linux/fsnotify.h>
> >> +#include <linux/rhashtable.h>
> >>  #include <linux/nfs_ssc.h>
> >> +
> >>  #include "xdr4.h"
> >>  #include "xdr4cb.h"
> >>  #include "vfs.h"
> >> @@ -721,6 +723,18 @@ static unsigned int file_hashval(const struct svc_fh *fh)
> >> 
> >>  static struct hlist_head file_hashtbl[FILE_HASH_SIZE];
> >> 
> >> +static struct rhltable nfs4_file_rhltable ____cacheline_aligned_in_smp;
> >> +
> >> +static const struct rhashtable_params nfs4_file_rhash_params = {
> >> +	.key_len	= sizeof_field(struct nfs4_file, fi_inode),
> >> +	.key_offset	= offsetof(struct nfs4_file, fi_inode),
> >> +	.head_offset	= offsetof(struct nfs4_file, fi_rlist),
> >> +
> >> +	/* Reduce resizing churn on light workloads */
> >> +	.min_size	= 256,	/* buckets */
> > 
> > Every time I see this line I groan a little bit. Probably I'm groaning
> > at rhashtable - you shouldn't have to worry about these sorts of
> > details when using an API... but I agree that avoiding churn is likely
> > a good idea.
> > 
> > Where did 256 come from?
> 
> Here's the current file_hashtbl definition:
> 
> 	/* hash table for nfs4_file */
> 	#define FILE_HASH_BITS	8
> 	#define FILE_HASH_SIZE	(1 << FILE_HASH_BITS)
> 
> 	static unsigned int file_hashval(const struct svc_fh *fh)
> 	{
> 		struct inode *inode = d_inode(fh->fh_dentry);
> 
> 		/* XXX: why not (here & in file cache) use inode? */
> 		return (unsigned int)hash_long(inode->i_ino, FILE_HASH_BITS);
> 	}
> 
> 	static struct hlist_head file_hashtbl[FILE_HASH_SIZE];
> 
> 256 buckets is the size of the existing file_hashtbl.
> 
> > Would PAGE_SIZE/sizeof(void*) make more sense
> > (though that is 512).
> 
> For rhashtable, you need to account for sizeof(struct bucket_table),
> if I'm reading nested_bucket_table_alloc() correctly.
> 
> 256 buckets is 2048 bytes + sizeof(struct bucket_table). 512 buckets
> would push us over a page.

Ah yes, of course. That does suggest that 256 is a better choice.

> > How much churn is too much? The default is 4 and we grow at >75% and
> > shrink at <30%, so at 4 entries the table would resize to 8, and at 2
> > entries it would shrink back. That does sound like churn.
> > If we choose 8, then at 7 we grow to 16 and at 4 we go back to 8.
> > If we choose 16, then at 13 we grow to 32 and at 9 we go back to 16.
> > 
> > If we choose 64, then at 49 we grow to 128 and at 39 we go back to 64.
> > 
> > The margin seems rather narrow. Maybe 30% is too high - 15% might be
> > a lot better. But changing that might be difficult.
> 
> I could go a little smaller. Basically, table resizing is OK when we're
> talking about a lot of buckets, because that overhead is very likely to
> be amortized over many insertions and removals.
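For concreteness, the churn margins being discussed can be tabulated
with a small userspace sketch. The 75% grow and 30% shrink figures are
the rhashtable defaults as cited in this thread; the program itself is
illustrative only, not kernel code.

#include <stdio.h>

int main(void)
{
	/* Candidate minimum table sizes from the discussion above */
	static const unsigned int sizes[] = { 4, 8, 16, 64, 256 };
	unsigned int i;

	for (i = 0; i < sizeof(sizes) / sizeof(sizes[0]); i++) {
		unsigned int size = sizes[i];

		/*
		 * rhashtable grows when occupancy exceeds 75% of the
		 * bucket count and shrinks when it falls below 30%
		 * (the defaults quoted in this thread).
		 */
		printf("size %3u: grows above %5.1f entries, shrinks below %5.1f entries\n",
		       size, size * 0.75, size * 0.30);
	}
	return 0;
}

For a table of 8 buckets this prints a growth threshold of 6.0, so the
table doubles at 7 entries, matching the numbers quoted above; a
min_size of 256 keeps light workloads well clear of both thresholds.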
> 
> > So I don't have a good recommendation, but I don't like magic numbers.
> > Maybe PAGE_SIZE/sizeof(void*) ??
> 
> The only thing I can think of would be
> 
> 	#define NFS4_FILE_HASH_SIZE	(some number or constant calculation)
> 
> which to me isn't much better than
> 
> 	.min_size	= 256,	/* buckets */
> 
> I will ponder some more.

Maybe just a comment saying that this number will allow the struct
bucket_table to fit in one 4K page.

Thanks,
NeilBrown

> > But either way
> >  Reviewed-by: NeilBrown <neilb@xxxxxxx>
> > 
> > Thanks,
> > NeilBrown
> > 
> > 
> >> +	.automatic_shrinking	= true,
> >> +};
> >> +
> >>  /*
> >>   * Check if courtesy clients have conflicting access and resolve it if possible
> >>   *
> >> @@ -8023,10 +8037,16 @@
> >>  nfs4_state_start(void)
> >>  {
> >>  	int ret;
> >> 
> >> -	ret = nfsd4_create_callback_queue();
> >> +	ret = rhltable_init(&nfs4_file_rhltable, &nfs4_file_rhash_params);
> >>  	if (ret)
> >>  		return ret;
> >> 
> >> +	ret = nfsd4_create_callback_queue();
> >> +	if (ret) {
> >> +		rhltable_destroy(&nfs4_file_rhltable);
> >> +		return ret;
> >> +	}
> >> +
> >>  	set_max_delegations();
> >>  	return 0;
> >>  }
> >> @@ -8057,6 +8077,7 @@ nfs4_state_shutdown_net(struct net *net)
> >> 
> >>  	nfsd4_client_tracking_exit(net);
> >>  	nfs4_state_destroy_net(net);
> >> +	rhltable_destroy(&nfs4_file_rhltable);
> >>  #ifdef CONFIG_NFSD_V4_2_INTER_SSC
> >>  	nfsd4_ssc_shutdown_umount(nn);
> >>  #endif
> >> diff --git a/fs/nfsd/state.h b/fs/nfsd/state.h
> >> index e2daef3cc003..190fc7e418a4 100644
> >> --- a/fs/nfsd/state.h
> >> +++ b/fs/nfsd/state.h
> >> @@ -546,6 +546,7 @@ struct nfs4_file {
> >>  	bool			fi_aliased;
> >>  	spinlock_t		fi_lock;
> >>  	struct hlist_node	fi_hash;	/* hash on fi_fhandle */
> >> +	struct rhlist_head	fi_rlist;
> >>  	struct list_head	fi_stateids;
> >>  	union {
> >>  		struct list_head	fi_delegations;
> >> 
> 
> -- 
> Chuck Lever
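For reference, here is one sketch of the shape the thread converges on:
Chuck's NFS4_FILE_HASH_SIZE idea combined with NeilBrown's suggested
comment about fitting in a 4K page. This is illustrative only, built
from the patch hunks quoted above; the macro name was floated in the
discussion but this is not the committed code.

/*
 * 256 buckets of 8-byte pointers is 2048 bytes; adding the enclosing
 * struct bucket_table still fits within a single 4K page, whereas
 * 512 buckets would push the allocation over a page.
 */
#define NFS4_FILE_HASH_SIZE	256

static const struct rhashtable_params nfs4_file_rhash_params = {
	.key_len		= sizeof_field(struct nfs4_file, fi_inode),
	.key_offset		= offsetof(struct nfs4_file, fi_inode),
	.head_offset		= offsetof(struct nfs4_file, fi_rlist),

	/* Reduce resizing churn on light workloads */
	.min_size		= NFS4_FILE_HASH_SIZE,	/* buckets */
	.automatic_shrinking	= true,
};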