> On Dec 8, 2021, at 10:54 AM, Dai Ngo <dai.ngo@xxxxxxxxxx> wrote: > > On 12/6/21 11:55 AM, Chuck Lever III wrote: > >> >>> + >>> +/* >>> + * Function to check if the nfserr_share_denied error for 'fp' resulted >>> + * from conflict with courtesy clients then release their state to resolve >>> + * the conflict. >>> + * >>> + * Function returns: >>> + * 0 - no conflict with courtesy clients >>> + * >0 - conflict with courtesy clients resolved, try access/deny check again >>> + * -1 - conflict with courtesy clients being resolved in background >>> + * return nfserr_jukebox to NFS client >>> + */ >>> +static int >>> +nfs4_destroy_clnts_with_sresv_conflict(struct svc_rqst *rqstp, >>> + struct nfs4_file *fp, struct nfs4_ol_stateid *stp, >>> + u32 access, bool share_access) >>> +{ >>> + int cnt = 0; >>> + int async_cnt = 0; >>> + bool no_retry = false; >>> + struct nfs4_client *cl; >>> + struct list_head *pos, *next, reaplist; >>> + struct nfsd_net *nn = net_generic(SVC_NET(rqstp), nfsd_net_id); >>> + >>> + INIT_LIST_HEAD(&reaplist); >>> + spin_lock(&nn->client_lock); >>> + list_for_each_safe(pos, next, &nn->client_lru) { >>> + cl = list_entry(pos, struct nfs4_client, cl_lru); >>> + /* >>> + * check all nfs4_ol_stateid of this client >>> + * for conflicts with 'access'mode. >>> + */ >>> + if (nfs4_check_deny_bmap(cl, fp, stp, access, share_access)) { >>> + if (!test_bit(NFSD4_COURTESY_CLIENT, &cl->cl_flags)) { >>> + /* conflict with non-courtesy client */ >>> + no_retry = true; >>> + cnt = 0; >>> + goto out; >>> + } >>> + /* >>> + * if too many to resolve synchronously >>> + * then do the rest in background >>> + */ >>> + if (cnt > 100) { >>> + set_bit(NFSD4_DESTROY_COURTESY_CLIENT, &cl->cl_flags); >>> + async_cnt++; >>> + continue; >>> + } >>> + if (mark_client_expired_locked(cl)) >>> + continue; >>> + cnt++; >>> + list_add(&cl->cl_lru, &reaplist); >>> + } >>> + } >> Bruce suggested simply returning NFS4ERR_DELAY for all cases. >> That would simplify this quite a bit for what is a rare edge >> case. > > If we always do this asynchronously by returning NFS4ERR_DELAY > for all cases then the following pynfs tests need to be modified > to handle the error: > > RENEW3 st_renew.testExpired : FAILURE > LKU10 st_locku.testTimedoutUnlock : FAILURE > CLOSE9 st_close.testTimedoutClose2 : FAILURE > > and any new tests that opens file have to be prepared to handle > NFS4ERR_DELAY due to the lack of destroy_clientid in 4.0. > > Do we still want to take this approach? I'm still interested, but Bruce should chime in. Maybe Calum could have a look under the covers of pynfs and see how difficult the change might be. -- Chuck Lever