On Wed, 2021-12-08 at 07:54 -0800, dai.ngo@xxxxxxxxxx wrote:
> On 12/6/21 11:55 AM, Chuck Lever III wrote:
> >
> > > +
> > > +/*
> > > + * Function to check if the nfserr_share_denied error for 'fp' resulted
> > > + * from conflict with courtesy clients then release their state to resolve
> > > + * the conflict.
> > > + *
> > > + * Function returns:
> > > + *	 0 -  no conflict with courtesy clients
> > > + *	>0 -  conflict with courtesy clients resolved, try access/deny check again
> > > + *	-1 -  conflict with courtesy clients being resolved in background
> > > + *	      return nfserr_jukebox to NFS client
> > > + */
> > > +static int
> > > +nfs4_destroy_clnts_with_sresv_conflict(struct svc_rqst *rqstp,
> > > +			struct nfs4_file *fp, struct nfs4_ol_stateid *stp,
> > > +			u32 access, bool share_access)
> > > +{
> > > +	int cnt = 0;
> > > +	int async_cnt = 0;
> > > +	bool no_retry = false;
> > > +	struct nfs4_client *cl;
> > > +	struct list_head *pos, *next, reaplist;
> > > +	struct nfsd_net *nn = net_generic(SVC_NET(rqstp), nfsd_net_id);
> > > +
> > > +	INIT_LIST_HEAD(&reaplist);
> > > +	spin_lock(&nn->client_lock);
> > > +	list_for_each_safe(pos, next, &nn->client_lru) {
> > > +		cl = list_entry(pos, struct nfs4_client, cl_lru);
> > > +		/*
> > > +		 * check all nfs4_ol_stateid of this client
> > > +		 * for conflicts with 'access'mode.
> > > +		 */
> > > +		if (nfs4_check_deny_bmap(cl, fp, stp, access, share_access)) {
> > > +			if (!test_bit(NFSD4_COURTESY_CLIENT, &cl->cl_flags)) {
> > > +				/* conflict with non-courtesy client */
> > > +				no_retry = true;
> > > +				cnt = 0;
> > > +				goto out;
> > > +			}
> > > +			/*
> > > +			 * if too many to resolve synchronously
> > > +			 * then do the rest in background
> > > +			 */
> > > +			if (cnt > 100) {
> > > +				set_bit(NFSD4_DESTROY_COURTESY_CLIENT, &cl->cl_flags);
> > > +				async_cnt++;
> > > +				continue;
> > > +			}
> > > +			if (mark_client_expired_locked(cl))
> > > +				continue;
> > > +			cnt++;
> > > +			list_add(&cl->cl_lru, &reaplist);
> > > +		}
> > > +	}
> >
> > Bruce suggested simply returning NFS4ERR_DELAY for all cases.
> > That would simplify this quite a bit for what is a rare edge
> > case.
>
> If we always do this asynchronously by returning NFS4ERR_DELAY
> for all cases then the following pynfs tests need to be modified
> to handle the error:
>
> RENEW3   st_renew.testExpired          : FAILURE
> LKU10    st_locku.testTimedoutUnlock   : FAILURE
> CLOSE9   st_close.testTimedoutClose2   : FAILURE
>
> and any new tests that opens file have to be prepared to handle
> NFS4ERR_DELAY due to the lack of destroy_clientid in 4.0.
>
> Do we still want to take this approach?

NFS4ERR_DELAY is a valid error for both CLOSE and LOCKU (see RFC7530
section 13.2
https://datatracker.ietf.org/doc/html/rfc7530#section-13.2 ) so if
pynfs complains, then it needs fixing regardless.

RENEW, on the other hand, cannot return NFS4ERR_DELAY, but why would it
need to? Either the lease is still valid, or else someone is already
trying to tear it down due to an expiration event. I don't see why
courtesy locks need to add any further complexity to that test.

-- 
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@xxxxxxxxxxxxxxx
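
For illustration only: if the server returns NFS4ERR_DELAY for these
conflicts, a caller of CLOSE or LOCKU has to be ready to re-issue the
operation. The following is a minimal sketch of such a bounded retry,
not code from pynfs or the kernel. The names retry_on_delay, nfs4_op_fn,
fake_server and fake_close are hypothetical; only the status values
come from RFC 7530.

```c
#include <stdint.h>

/* nfsstat4 values as assigned in RFC 7530. */
enum { NFS4_OK = 0, NFS4ERR_DELAY = 10008 };

/* Hypothetical callback: issues one operation, returns its nfsstat4. */
typedef uint32_t (*nfs4_op_fn)(void *ctx);

/*
 * Re-issue 'op' while the server answers NFS4ERR_DELAY, giving up
 * after 'max_tries' attempts. A real caller would sleep between
 * attempts; that is left out to keep the sketch short.
 */
static uint32_t retry_on_delay(nfs4_op_fn op, void *ctx, int max_tries)
{
	uint32_t status = NFS4ERR_DELAY;

	for (int i = 0; i < max_tries; i++) {
		status = op(ctx);
		if (status != NFS4ERR_DELAY)
			break;
	}
	return status;
}

/* Hypothetical stand-in for a server that delays 'remaining' times. */
struct fake_server {
	int remaining;
	int calls;
};

static uint32_t fake_close(void *ctx)
{
	struct fake_server *s = ctx;

	s->calls++;
	if (s->remaining > 0) {
		s->remaining--;
		return NFS4ERR_DELAY;
	}
	return NFS4_OK;
}
```

With a fake_server that delays twice, retry_on_delay(fake_close, &s, 5)
makes three calls before it sees NFS4_OK. RENEW would not go through a
wrapper like this, since NFS4ERR_DELAY is not a valid RENEW result.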