On Tue, 03 Dec 2024, Chuck Lever III wrote: > > > > On Nov 21, 2024, at 5:29 PM, Chuck Lever III <chuck.lever@xxxxxxxxxx> wrote: > > > > > > > >> On Nov 21, 2024, at 4:47 PM, NeilBrown <neilb@xxxxxxx> wrote: > >> > >> On Wed, 20 Nov 2024, Chuck Lever wrote: > >>> On Wed, Nov 20, 2024 at 09:35:00AM +1100, NeilBrown wrote: > >>>> On Wed, 20 Nov 2024, Chuck Lever wrote: > >>>>> On Tue, Nov 19, 2024 at 11:41:32AM +1100, NeilBrown wrote: > >>>>>> Reducing the number of slots in the session slot table requires > >>>>>> confirmation from the client. This patch adds reduce_session_slots() > >>>>>> which starts the process of getting confirmation, but never calls it. > >>>>>> That will come in a later patch. > >>>>>> > >>>>>> Before we can free a slot we need to confirm that the client won't try > >>>>>> to use it again. This involves returning a lower cr_maxrequests in a > >>>>>> SEQUENCE reply and then seeing a ca_maxrequests on the same slot which > >>>>>> is not larger than we limit we are trying to impose. So for each slot > >>>>>> we need to remember that we have sent a reduced cr_maxrequests. > >>>>>> > >>>>>> To achieve this we introduce a concept of request "generations". Each > >>>>>> time we decide to reduce cr_maxrequests we increment the generation > >>>>>> number, and record this when we return the lower cr_maxrequests to the > >>>>>> client. When a slot with the current generation reports a low > >>>>>> ca_maxrequests, we commit to that level and free extra slots. > >>>>>> > >>>>>> We use an 8 bit generation number (64 seems wasteful) and if it cycles > >>>>>> we iterate all slots and reset the generation number to avoid false matches. > >>>>>> > >>>>>> When we free a slot we store the seqid in the slot pointer so that it can > >>>>>> be restored when we reactivate the slot. The RFC can be read as > >>>>>> suggesting that the slot number could restart from one after a slot is > >>>>>> retired and reactivated, but also suggests that retiring slots is not > >>>>>> required. So when we reactive a slot we accept with the next seqid in > >>>>>> sequence, or 1. > >>>>>> > >>>>>> When decoding sa_highest_slotid into maxslots we need to add 1 - this > >>>>>> matches how it is encoded for the reply. > >>>>>> > >>>>>> Signed-off-by: NeilBrown <neilb@xxxxxxx> > >>>>>> --- > >>>>>> fs/nfsd/nfs4state.c | 81 ++++++++++++++++++++++++++++++++++++++------- > >>>>>> fs/nfsd/nfs4xdr.c | 5 +-- > >>>>>> fs/nfsd/state.h | 4 +++ > >>>>>> fs/nfsd/xdr4.h | 2 -- > >>>>>> 4 files changed, 76 insertions(+), 16 deletions(-) > >>>>>> > >>>>>> diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c > >>>>>> index fb522165b376..0625b0aec6b8 100644 > >>>>>> --- a/fs/nfsd/nfs4state.c > >>>>>> +++ b/fs/nfsd/nfs4state.c > >>>>>> @@ -1910,17 +1910,55 @@ gen_sessionid(struct nfsd4_session *ses) > >>>>>> #define NFSD_MIN_HDR_SEQ_SZ (24 + 12 + 44) > >>>>>> > >>>>>> static void > >>>>>> -free_session_slots(struct nfsd4_session *ses) > >>>>>> +free_session_slots(struct nfsd4_session *ses, int from) > >>>>>> { > >>>>>> int i; > >>>>>> > >>>>>> - for (i = 0; i < ses->se_fchannel.maxreqs; i++) { > >>>>>> + if (from >= ses->se_fchannel.maxreqs) > >>>>>> + return; > >>>>>> + > >>>>>> + for (i = from; i < ses->se_fchannel.maxreqs; i++) { > >>>>>> struct nfsd4_slot *slot = xa_load(&ses->se_slots, i); > >>>>>> > >>>>>> - xa_erase(&ses->se_slots, i); > >>>>>> + /* > >>>>>> + * Save the seqid in case we reactivate this slot. > >>>>>> + * This will never require a memory allocation so GFP > >>>>>> + * flag is irrelevant > >>>>>> + */ > >>>>>> + xa_store(&ses->se_slots, i, xa_mk_value(slot->sl_seqid), > >>>>>> + GFP_ATOMIC); > >>>>> > >>>>> Again... ATOMIC is probably not what we want here, even if it is > >>>>> only documentary. > >>>> > >>>> Why not? It might be called under a spinlock so GFP_KERNEL might trigger > >>>> a warning. > >>> > >>> I find using GFP_ATOMIC here to be confusing -- it requests > >>> allocation from special memory reserves and is to be used in > >>> situations where allocation might result in system failure. That is > >>> clearly not the case here, and the resulting memory allocation might > >>> be long-lived. > >> > >> Would you be comfortable with GFP_NOWAIT which leaves out __GFP_HIGH ?? > > > > I will be comfortable when I hear back from Matthew and Liam. > > > > :-) > > > > > >>> I see the comment that says memory won't actually be allocated. I'm > >>> not sure that's the way xa_store() works, however. > >> > >> xarray.rst says: > >> > >> The xa_store(), xa_cmpxchg(), xa_alloc(), > >> xa_reserve() and xa_insert() functions take a gfp_t > >> parameter in case the XArray needs to allocate memory to store this entry. > >> If the entry is being deleted, no memory allocation needs to be performed, > >> and the GFP flags specified will be ignored.` > >> > >> The particular context is that a normal pointer is currently stored a > >> the given index, and we are replacing that with a number. The above > >> doesn't explicitly say that won't require a memory allocation, but I > >> think it is reasonable to say it won't need "to allocate memory to store > >> this entry" - as an entry is already stored - so allocation should not > >> be needed. > > > > xa_mk_value() converts a number to a pointer, and xa_store > > stores that pointer. > > > > I suspect that xa_store() is allowed to rebalance the > > xarray's internal data structures, and that could result > > in memory release or allocation. That's why a GFP flag is > > one of the arguments. > > Matthew says the xa_store() is guaranteed not to do a memory > allocation in this case. However, they prefer an annotation > of the call site with a "0" GFP argument to show that the > allocation flags are not relevant. > > Does this: > > xa_store(&ses->se_slots, i, xa_mk_value(slot->sl_seqid), 0); > > work for you? Sure. And it looks like sparse will be happy even though "0" isn't explicitly "gfp_t" because 0 is "special". I might prefer GFP_NULL or similar, but 0 certainly works for me. I'll include that when I resend. Thanks NeilBrown