Re: [PATCH 5/6] nfsd: add support for freeing unused session-DRC slots

"NeilBrown" <neilb@xxxxxxx> · Fri, 22 Nov 2024 08:47:33 +1100

On Wed, 20 Nov 2024, Chuck Lever wrote:
> On Wed, Nov 20, 2024 at 09:35:00AM +1100, NeilBrown wrote:
> > On Wed, 20 Nov 2024, Chuck Lever wrote:
> > > On Tue, Nov 19, 2024 at 11:41:32AM +1100, NeilBrown wrote:
> > > > Reducing the number of slots in the session slot table requires
> > > > confirmation from the client.  This patch adds reduce_session_slots()
> > > > which starts the process of getting confirmation, but never calls it.
> > > > That will come in a later patch.
> > > > 
> > > > Before we can free a slot we need to confirm that the client won't try
> > > > to use it again.  This involves returning a lower cr_maxrequests in a
> > > > SEQUENCE reply and then seeing a ca_maxrequests on the same slot which
> > > > is not larger than we limit we are trying to impose.  So for each slot
> > > > we need to remember that we have sent a reduced cr_maxrequests.
> > > > 
> > > > To achieve this we introduce a concept of request "generations".  Each
> > > > time we decide to reduce cr_maxrequests we increment the generation
> > > > number, and record this when we return the lower cr_maxrequests to the
> > > > client.  When a slot with the current generation reports a low
> > > > ca_maxrequests, we commit to that level and free extra slots.
> > > > 
> > > > We use an 8 bit generation number (64 seems wasteful) and if it cycles
> > > > we iterate all slots and reset the generation number to avoid false matches.
> > > > 
> > > > When we free a slot we store the seqid in the slot pointer so that it can
> > > > be restored when we reactivate the slot.  The RFC can be read as
> > > > suggesting that the slot number could restart from one after a slot is
> > > > retired and reactivated, but also suggests that retiring slots is not
> > > > required.  So when we reactive a slot we accept with the next seqid in
> > > > sequence, or 1.
> > > > 
> > > > When decoding sa_highest_slotid into maxslots we need to add 1 - this
> > > > matches how it is encoded for the reply.
> > > > 
> > > > Signed-off-by: NeilBrown <neilb@xxxxxxx>
> > > > ---
> > > >  fs/nfsd/nfs4state.c | 81 ++++++++++++++++++++++++++++++++++++++-------
> > > >  fs/nfsd/nfs4xdr.c   |  5 +--
> > > >  fs/nfsd/state.h     |  4 +++
> > > >  fs/nfsd/xdr4.h      |  2 --
> > > >  4 files changed, 76 insertions(+), 16 deletions(-)
> > > > 
> > > > diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> > > > index fb522165b376..0625b0aec6b8 100644
> > > > --- a/fs/nfsd/nfs4state.c
> > > > +++ b/fs/nfsd/nfs4state.c
> > > > @@ -1910,17 +1910,55 @@ gen_sessionid(struct nfsd4_session *ses)
> > > >  #define NFSD_MIN_HDR_SEQ_SZ  (24 + 12 + 44)
> > > >  
> > > >  static void
> > > > -free_session_slots(struct nfsd4_session *ses)
> > > > +free_session_slots(struct nfsd4_session *ses, int from)
> > > >  {
> > > >  	int i;
> > > >  
> > > > -	for (i = 0; i < ses->se_fchannel.maxreqs; i++) {
> > > > +	if (from >= ses->se_fchannel.maxreqs)
> > > > +		return;
> > > > +
> > > > +	for (i = from; i < ses->se_fchannel.maxreqs; i++) {
> > > >  		struct nfsd4_slot *slot = xa_load(&ses->se_slots, i);
> > > >  
> > > > -		xa_erase(&ses->se_slots, i);
> > > > +		/*
> > > > +		 * Save the seqid in case we reactivate this slot.
> > > > +		 * This will never require a memory allocation so GFP
> > > > +		 * flag is irrelevant
> > > > +		 */
> > > > +		xa_store(&ses->se_slots, i, xa_mk_value(slot->sl_seqid),
> > > > +			 GFP_ATOMIC);
> > > 
> > > Again... ATOMIC is probably not what we want here, even if it is
> > > only documentary.
> > 
> > Why not?  It might be called under a spinlock so GFP_KERNEL might trigger
> > a warning.
> 
> I find using GFP_ATOMIC here to be confusing -- it requests
> allocation from special memory reserves and is to be used in
> situations where allocation might result in system failure. That is
> clearly not the case here, and the resulting memory allocation might
> be long-lived.

Would you be comfortable with GFP_NOWAIT which leaves out __GFP_HIGH ??

My understanding of how GFP_ATOMIC is used is it is what people choose
when they have to allocate in a no-sleep context.  It can fail and there
must always be a fall-back option.  In many cases GFP_NOWAIT could
possibly be used when it isn't a high priority, but there are 430 uses
for GFP_NOWAIT compared with over 5000 of GFP_ATOMIC.

> 
> I see the comment that says memory won't actually be allocated. I'm
> not sure that's the way xa_store() works, however.

xarray.rst says:

  The xa_store(), xa_cmpxchg(), xa_alloc(),
  xa_reserve() and xa_insert() functions take a gfp_t
  parameter in case the XArray needs to allocate memory to store this entry.
  If the entry is being deleted, no memory allocation needs to be performed,
  and the GFP flags specified will be ignored.`

The particular context is that a normal pointer is currently stored a
the given index, and we are replacing that with a number.  The above
doesn't explicitly say that won't require a memory allocation, but I
think it is reasonable to say it won't need "to allocate memory to store
this entry" - as an entry is already stored - so allocation should not
be needed.

> 
> I don't immediately see another good choice, however. I can reach
> out to Matthew and Liam and see if they have a better idea.
> 
> 
> > > And, I thought we determined that an unretired slot had a sequence
> > > number that is reset. Why save the slot's seqid? If I'm missing
> > > something, the comment here should be bolstered to explain it.
> > 
> > It isn't clear to me that we determined that - only the some people
> > asserted it.
> 
> From what I've read, everyone else who responded has said "use one".
> And they have provided enough spec quotations that 1 seems like the
> right initial slot sequence number value, always.
> 
> You should trust Tom Talpey's opinion on this. He was directly
> involved 25 years ago when sessions were invented in DAFS and then
> transferred into the NFSv4.1 protocol.

Dave Noveck (also deeply involved) say:

   It does.  The problem is that it undercuts the core goal of the
   slot-based approach 
   In that it makes it possible to have multiple requests with the same
   session id/ slot ID / sequence triple.

i.e.  resetting to 1 undercuts the core goal....  That is not a
resounding endorsement.

While I respect the people, I prefer to trust reasoned arguments.

What exactly is the sequence number protecting against?  It must be
protecting against a request being sent, not reply received, connection
closed, request sent on another connection, reply received.  Original
request getting to the server before the "close" message.  Server must
be sure not to handle this request.  sequence number provides that.

But what if the above all happens for request with seqno of 1, then the
server and client negotiate a "retiring" of slot 1 and then it's reuse
before the original request arrives.  How does the server know to ignore
it?

And Tom said that the handling of maxreq etc is "optional" for both
server and client.  So how can the server know if the client has retired
the slot when doing so is optional???

I really don't think we have clarity on this at all.

> 
> 
> > Until the spec is clarified I think it is safest to be cautious.
> 
> The usual line we draw for adding code/features/complexity is the
> proposer must demonstrate a use case for it. So far I have not seen
> a client implementation that needs a server to remember the sequence
> number in a slot that has been shrunken and then re-activated.

And I cannot in good faith submit code that I think violates the spec.

> 
> Will this dead slot be subject to being freed by the session
> shrinker?

No, but it uses much much less space than a slot.

> 
> But the proposed implementation accepts 1 in this case, and it
> doesn't seem tremendously difficult to remove the "remember the
> seqid" mechanism once it has been codified to everyone's
> satisfaction. So I won't belabor the point.

Thank you!

NeilBrown