Re: [PATCH RFC v18 02/11] NFSD: Add courtesy client state, macro and spinlock to support courteous server

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Mar 29, 2022 at 07:32:57PM +0000, Chuck Lever III wrote:
> 
> 
> > On Mar 29, 2022, at 2:39 PM, J. Bruce Fields <bfields@xxxxxxxxxxxx> wrote:
> > 
> > On Tue, Mar 29, 2022 at 11:19:51AM -0700, dai.ngo@xxxxxxxxxx wrote:
> >> 
> >> On 3/29/22 9:30 AM, J. Bruce Fields wrote:
> >>> On Tue, Mar 29, 2022 at 09:20:02AM -0700, dai.ngo@xxxxxxxxxx wrote:
> >>>> On 3/29/22 8:47 AM, J. Bruce Fields wrote:
> >>>>> On Thu, Mar 24, 2022 at 09:34:42PM -0700, Dai Ngo wrote:
> >>>>>> Update nfs4_client to add:
> >>>>>> . cl_cs_client_state: courtesy client state
> >>>>>> . cl_cs_lock: spinlock to synchronize access to cl_cs_client_state
> >>>>>> . cl_cs_list: list used by laundromat to process courtesy clients
> >>>>>> 
> >>>>>> Modify alloc_client to initialize these fields.
> >>>>>> 
> >>>>>> Signed-off-by: Dai Ngo <dai.ngo@xxxxxxxxxx>
> >>>>>> ---
> >>>>>> fs/nfsd/nfs4state.c |  2 ++
> >>>>>> fs/nfsd/nfsd.h      |  1 +
> >>>>>> fs/nfsd/state.h     | 33 +++++++++++++++++++++++++++++++++
> >>>>>> 3 files changed, 36 insertions(+)
> >>>>>> 
> >>>>>> diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> >>>>>> index 234e852fcdfa..a65d59510681 100644
> >>>>>> --- a/fs/nfsd/nfs4state.c
> >>>>>> +++ b/fs/nfsd/nfs4state.c
> >>>>>> @@ -2009,12 +2009,14 @@ static struct nfs4_client *alloc_client(struct xdr_netobj name)
> >>>>>> 	INIT_LIST_HEAD(&clp->cl_delegations);
> >>>>>> 	INIT_LIST_HEAD(&clp->cl_lru);
> >>>>>> 	INIT_LIST_HEAD(&clp->cl_revoked);
> >>>>>> +	INIT_LIST_HEAD(&clp->cl_cs_list);
> >>>>>> #ifdef CONFIG_NFSD_PNFS
> >>>>>> 	INIT_LIST_HEAD(&clp->cl_lo_states);
> >>>>>> #endif
> >>>>>> 	INIT_LIST_HEAD(&clp->async_copies);
> >>>>>> 	spin_lock_init(&clp->async_lock);
> >>>>>> 	spin_lock_init(&clp->cl_lock);
> >>>>>> +	spin_lock_init(&clp->cl_cs_lock);
> >>>>>> 	rpc_init_wait_queue(&clp->cl_cb_waitq, "Backchannel slot table");
> >>>>>> 	return clp;
> >>>>>> err_no_hashtbl:
> >>>>>> diff --git a/fs/nfsd/nfsd.h b/fs/nfsd/nfsd.h
> >>>>>> index 4fc1fd639527..23996c6ca75e 100644
> >>>>>> --- a/fs/nfsd/nfsd.h
> >>>>>> +++ b/fs/nfsd/nfsd.h
> >>>>>> @@ -336,6 +336,7 @@ void		nfsd_lockd_shutdown(void);
> >>>>>> #define COMPOUND_ERR_SLACK_SPACE	16     /* OP_SETATTR */
> >>>>>> #define NFSD_LAUNDROMAT_MINTIMEOUT      1   /* seconds */
> >>>>>> +#define	NFSD_COURTESY_CLIENT_TIMEOUT	(24 * 60 * 60)	/* seconds */
> >>>>>> /*
> >>>>>>  * The following attributes are currently not supported by the NFSv4 server:
> >>>>>> diff --git a/fs/nfsd/state.h b/fs/nfsd/state.h
> >>>>>> index 95457cfd37fc..40e390abc842 100644
> >>>>>> --- a/fs/nfsd/state.h
> >>>>>> +++ b/fs/nfsd/state.h
> >>>>>> @@ -283,6 +283,35 @@ struct nfsd4_sessionid {
> >>>>>> #define HEXDIR_LEN     33 /* hex version of 16 byte md5 of cl_name plus '\0' */
> >>>>>> /*
> >>>>>> + * CLIENT_  CLIENT_ CLIENT_
> >>>>>> + * COURTESY EXPIRED RECONNECTED      Meaning                  Where set
> >>>>>> + * -----------------------------------------------------------------------------
> >>>>>> + * | false | false | false | Confirmed, active    | Default                    |
> >>>>>> + * |---------------------------------------------------------------------------|
> >>>>>> + * | true  | false | false | Courtesy state.      | nfs4_get_client_reaplist   |
> >>>>>> + * |       |       |       | Lease/lock/share     |                            |
> >>>>>> + * |       |       |       | reservation conflict |                            |
> >>>>>> + * |       |       |       | can cause Courtesy   |                            |
> >>>>>> + * |       |       |       | client to be expired |                            |
> >>>>>> + * |---------------------------------------------------------------------------|
> >>>>>> + * | false | true  | false | Courtesy client to be| nfs4_laundromat            |
> >>>>>> + * |       |       |       | expired by Laundromat| nfsd4_lm_lock_expired      |
> >>>>>> + * |       |       |       | due to conflict     | nfsd4_discard_courtesy_clnt |
> >>>>>> + * |       |       |       |                      | nfsd4_expire_courtesy_clnt |
> >>>>>> + * |---------------------------------------------------------------------------|
> >>>>>> + * | false | false | true  | Courtesy client      | nfsd4_courtesy_clnt_expired|
> >>>>>> + * |       |       |       | reconnected,         |                            |
> >>>>>> + * |       |       |       | becoming active      |                            |
> >>>>>> + * -----------------------------------------------------------------------------
> >>> By the way, where is a client returned to the normal (0) state?  That
> >>> has to happen at some point.
> >> 
> >> For 4.1 courtesy client reconnects is detected in nfsd4_sequence,
> >> nfsd4_bind_conn_to_session.
> > 
> > Those are the places where NFSD54_CLIENT_RECONNECTED is set, which isn't
> > the question I asked.
> 
> "reconnected" simply means the client has gotten back in touch.

Again, my question was: when is cl_cs_client_state set back to 0?  As
far as I can tell, the answer is never.  That means, even long after the
client has reconnected, it's left in a weird state where it can be
suddenly expired for all sorts of reasons.

> The server then has to decide whether to allow the client to
> become active again or it needs to purge it. That decision
> is different for each operation and minor version. Look for
> "if (cl_cs_client_state == NFSD4_CLIENT_RECONNECTED)" for how
> those choices are made.
> 
> 
> >>> Why are RECONNECTED clients discarded in so many cases?  (E.g. whenever
> >>> a bind_conn_to_session fails).
> >> 
> >> find_in_sessionid_hashtbl: we discard the courtesy client when it
> >> reconnects and there is error from nfsd4_get_session_locked. This
> >> should be a rare condition so rather than reverting the client
> >> state back to courtesy, it is simpler just to discard it.
> > 
> > That may be a rare situation, but I don't believe the behavior of
> > discarding the client in this case is correct.
> 
> Can you explain this? It's a courtesy client... the server can
> decide it's expired at that point, can't it? IOW what breaks?

I'm not worried about courtesy clients, I'm worried about clients that
were courtesy clients but have since succesfully renewed their state.
Expiring them for a failed bind_conn_to_session isn't right.

--b.

> 
> 
> >> nfsd4_create_session/find_confirmed_client: I think the only time
> >> the courtesy client sends CREATE_SESSION, before sending the SEQUENCE
> >> to reconnect after missing its leases, is when it wants to do clientid
> >> trunking. This should be a rare condition so instead of dealing
> >> with it we just do not allow it and discard the client for now.
> > 
> > We can't wave away incorrect behavior with "but it's rare".  Users with
> > heavy and/or unusual workloads hit rare conditions.  Clients may change
> > their behavior over time.  (E.g., trunking may become more common.)
> 
> 
> --
> Chuck Lever
> 
> 



[Index of Archives]     [Linux Filesystem Development]     [Linux USB Development]     [Linux Media Development]     [Video for Linux]     [Linux NILFS]     [Linux Audio Users]     [Yosemite Info]     [Linux SCSI]

  Powered by Linux