On 12/6/21 11:55 AM, Chuck Lever III wrote:
+
+/*
+ * Function to check if the nfserr_share_denied error for 'fp' resulted
+ * from conflict with courtesy clients then release their state to resolve
+ * the conflict.
+ *
+ * Function returns:
+ * 0 - no conflict with courtesy clients
+ * >0 - conflict with courtesy clients resolved, try access/deny check again
+ * -1 - conflict with courtesy clients being resolved in background
+ * return nfserr_jukebox to NFS client
+ */
+static int
+nfs4_destroy_clnts_with_sresv_conflict(struct svc_rqst *rqstp,
+ struct nfs4_file *fp, struct nfs4_ol_stateid *stp,
+ u32 access, bool share_access)
+{
+ int cnt = 0;
+ int async_cnt = 0;
+ bool no_retry = false;
+ struct nfs4_client *cl;
+ struct list_head *pos, *next, reaplist;
+ struct nfsd_net *nn = net_generic(SVC_NET(rqstp), nfsd_net_id);
+
+ INIT_LIST_HEAD(&reaplist);
+ spin_lock(&nn->client_lock);
+ list_for_each_safe(pos, next, &nn->client_lru) {
+ cl = list_entry(pos, struct nfs4_client, cl_lru);
+ /*
+ * check all nfs4_ol_stateid of this client
+ * for conflicts with 'access'mode.
+ */
+ if (nfs4_check_deny_bmap(cl, fp, stp, access, share_access)) {
+ if (!test_bit(NFSD4_COURTESY_CLIENT, &cl->cl_flags)) {
+ /* conflict with non-courtesy client */
+ no_retry = true;
+ cnt = 0;
+ goto out;
+ }
+ /*
+ * if too many to resolve synchronously
+ * then do the rest in background
+ */
+ if (cnt > 100) {
+ set_bit(NFSD4_DESTROY_COURTESY_CLIENT, &cl->cl_flags);
+ async_cnt++;
+ continue;
+ }
+ if (mark_client_expired_locked(cl))
+ continue;
+ cnt++;
+ list_add(&cl->cl_lru, &reaplist);
+ }
+ }
Bruce suggested simply returning NFS4ERR_DELAY for all cases.
That would simplify this quite a bit for what is a rare edge
case.
If we always do this asynchronously by returning NFS4ERR_DELAY
for all cases then the following pynfs tests need to be modified
to handle the error:
RENEW3 st_renew.testExpired : FAILURE
LKU10 st_locku.testTimedoutUnlock : FAILURE
CLOSE9 st_close.testTimedoutClose2 : FAILURE
and any new tests that opens file have to be prepared to handle
NFS4ERR_DELAY due to the lack of destroy_clientid in 4.0.
Do we still want to take this approach?
-Dai