> On Dec 2, 2024, at 11:28 PM, NeilBrown <neilb@xxxxxxx> wrote: > > On Tue, 03 Dec 2024, Chuck Lever III wrote: >> >> >>> On Nov 21, 2024, at 5:29 PM, Chuck Lever III <chuck.lever@xxxxxxxxxx> wrote: >>> >>> >>> >>>> On Nov 21, 2024, at 4:47 PM, NeilBrown <neilb@xxxxxxx> wrote: >>>> >>>> On Wed, 20 Nov 2024, Chuck Lever wrote: >>>>> On Wed, Nov 20, 2024 at 09:35:00AM +1100, NeilBrown wrote: >>>>>> On Wed, 20 Nov 2024, Chuck Lever wrote: >>>>>>> On Tue, Nov 19, 2024 at 11:41:32AM +1100, NeilBrown wrote: >>>>>>>> Reducing the number of slots in the session slot table requires >>>>>>>> confirmation from the client. This patch adds reduce_session_slots() >>>>>>>> which starts the process of getting confirmation, but never calls it. >>>>>>>> That will come in a later patch. >>>>>>>> >>>>>>>> Before we can free a slot we need to confirm that the client won't try >>>>>>>> to use it again. This involves returning a lower cr_maxrequests in a >>>>>>>> SEQUENCE reply and then seeing a ca_maxrequests on the same slot which >>>>>>>> is not larger than we limit we are trying to impose. So for each slot >>>>>>>> we need to remember that we have sent a reduced cr_maxrequests. >>>>>>>> >>>>>>>> To achieve this we introduce a concept of request "generations". Each >>>>>>>> time we decide to reduce cr_maxrequests we increment the generation >>>>>>>> number, and record this when we return the lower cr_maxrequests to the >>>>>>>> client. When a slot with the current generation reports a low >>>>>>>> ca_maxrequests, we commit to that level and free extra slots. >>>>>>>> >>>>>>>> We use an 8 bit generation number (64 seems wasteful) and if it cycles >>>>>>>> we iterate all slots and reset the generation number to avoid false matches. >>>>>>>> >>>>>>>> When we free a slot we store the seqid in the slot pointer so that it can >>>>>>>> be restored when we reactivate the slot. The RFC can be read as >>>>>>>> suggesting that the slot number could restart from one after a slot is >>>>>>>> retired and reactivated, but also suggests that retiring slots is not >>>>>>>> required. So when we reactive a slot we accept with the next seqid in >>>>>>>> sequence, or 1. >>>>>>>> >>>>>>>> When decoding sa_highest_slotid into maxslots we need to add 1 - this >>>>>>>> matches how it is encoded for the reply. >>>>>>>> >>>>>>>> Signed-off-by: NeilBrown <neilb@xxxxxxx> >>>>>>>> --- >>>>>>>> fs/nfsd/nfs4state.c | 81 ++++++++++++++++++++++++++++++++++++++------- >>>>>>>> fs/nfsd/nfs4xdr.c | 5 +-- >>>>>>>> fs/nfsd/state.h | 4 +++ >>>>>>>> fs/nfsd/xdr4.h | 2 -- >>>>>>>> 4 files changed, 76 insertions(+), 16 deletions(-) >>>>>>>> >>>>>>>> diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c >>>>>>>> index fb522165b376..0625b0aec6b8 100644 >>>>>>>> --- a/fs/nfsd/nfs4state.c >>>>>>>> +++ b/fs/nfsd/nfs4state.c >>>>>>>> @@ -1910,17 +1910,55 @@ gen_sessionid(struct nfsd4_session *ses) >>>>>>>> #define NFSD_MIN_HDR_SEQ_SZ (24 + 12 + 44) >>>>>>>> >>>>>>>> static void >>>>>>>> -free_session_slots(struct nfsd4_session *ses) >>>>>>>> +free_session_slots(struct nfsd4_session *ses, int from) >>>>>>>> { >>>>>>>> int i; >>>>>>>> >>>>>>>> - for (i = 0; i < ses->se_fchannel.maxreqs; i++) { >>>>>>>> + if (from >= ses->se_fchannel.maxreqs) >>>>>>>> + return; >>>>>>>> + >>>>>>>> + for (i = from; i < ses->se_fchannel.maxreqs; i++) { >>>>>>>> struct nfsd4_slot *slot = xa_load(&ses->se_slots, i); >>>>>>>> >>>>>>>> - xa_erase(&ses->se_slots, i); >>>>>>>> + /* >>>>>>>> + * Save the seqid in case we reactivate this slot. >>>>>>>> + * This will never require a memory allocation so GFP >>>>>>>> + * flag is irrelevant >>>>>>>> + */ >>>>>>>> + xa_store(&ses->se_slots, i, xa_mk_value(slot->sl_seqid), >>>>>>>> + GFP_ATOMIC); >>>>>>> >>>>>>> Again... ATOMIC is probably not what we want here, even if it is >>>>>>> only documentary. >>>>>> >>>>>> Why not? It might be called under a spinlock so GFP_KERNEL might trigger >>>>>> a warning. >>>>> >>>>> I find using GFP_ATOMIC here to be confusing -- it requests >>>>> allocation from special memory reserves and is to be used in >>>>> situations where allocation might result in system failure. That is >>>>> clearly not the case here, and the resulting memory allocation might >>>>> be long-lived. >>>> >>>> Would you be comfortable with GFP_NOWAIT which leaves out __GFP_HIGH ?? >>> >>> I will be comfortable when I hear back from Matthew and Liam. >>> >>> :-) >>> >>> >>>>> I see the comment that says memory won't actually be allocated. I'm >>>>> not sure that's the way xa_store() works, however. >>>> >>>> xarray.rst says: >>>> >>>> The xa_store(), xa_cmpxchg(), xa_alloc(), >>>> xa_reserve() and xa_insert() functions take a gfp_t >>>> parameter in case the XArray needs to allocate memory to store this entry. >>>> If the entry is being deleted, no memory allocation needs to be performed, >>>> and the GFP flags specified will be ignored.` >>>> >>>> The particular context is that a normal pointer is currently stored a >>>> the given index, and we are replacing that with a number. The above >>>> doesn't explicitly say that won't require a memory allocation, but I >>>> think it is reasonable to say it won't need "to allocate memory to store >>>> this entry" - as an entry is already stored - so allocation should not >>>> be needed. >>> >>> xa_mk_value() converts a number to a pointer, and xa_store >>> stores that pointer. >>> >>> I suspect that xa_store() is allowed to rebalance the >>> xarray's internal data structures, and that could result >>> in memory release or allocation. That's why a GFP flag is >>> one of the arguments. >> >> Matthew says the xa_store() is guaranteed not to do a memory >> allocation in this case. However, they prefer an annotation >> of the call site with a "0" GFP argument to show that the >> allocation flags are not relevant. >> >> Does this: >> >> xa_store(&ses->se_slots, i, xa_mk_value(slot->sl_seqid), 0); >> >> work for you? > > Sure. > And it looks like sparse will be happy even though "0" isn't explicitly > "gfp_t" because 0 is "special". > > I might prefer GFP_NULL or similar, but 0 certainly works for me. I'll > include that when I resend. Matthew suggested GFP_NOALLOC. But neither of these symbolic flags exist yet. I'd rather not hold up this series behind the bikeshedding of the flag name ;-) -- Chuck Lever