Re: [PATCH RFC v6 2/2] nfsd: Initial implementation of NFSv4 Courteous Server

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Dec 08, 2021 at 08:25:28AM -0800, dai.ngo@xxxxxxxxxx wrote:
> 
> On 12/8/21 8:16 AM, Trond Myklebust wrote:
> >On Wed, 2021-12-08 at 07:54 -0800, dai.ngo@xxxxxxxxxx wrote:
> >>On 12/6/21 11:55 AM, Chuck Lever III wrote:
> >>
> >>>>+
> >>>>+/*
> >>>>+ * Function to check if the nfserr_share_denied error for 'fp'
> >>>>resulted
> >>>>+ * from conflict with courtesy clients then release their state to
> >>>>resolve
> >>>>+ * the conflict.
> >>>>+ *
> >>>>+ * Function returns:
> >>>>+ *      0 -  no conflict with courtesy clients
> >>>>+ *     >0 -  conflict with courtesy clients resolved, try
> >>>>access/deny check again
> >>>>+ *     -1 -  conflict with courtesy clients being resolved in
> >>>>background
> >>>>+ *            return nfserr_jukebox to NFS client
> >>>>+ */
> >>>>+static int
> >>>>+nfs4_destroy_clnts_with_sresv_conflict(struct svc_rqst *rqstp,
> >>>>+                       struct nfs4_file *fp, struct
> >>>>nfs4_ol_stateid *stp,
> >>>>+                       u32 access, bool share_access)
> >>>>+{
> >>>>+       int cnt = 0;
> >>>>+       int async_cnt = 0;
> >>>>+       bool no_retry = false;
> >>>>+       struct nfs4_client *cl;
> >>>>+       struct list_head *pos, *next, reaplist;
> >>>>+       struct nfsd_net *nn = net_generic(SVC_NET(rqstp),
> >>>>nfsd_net_id);
> >>>>+
> >>>>+       INIT_LIST_HEAD(&reaplist);
> >>>>+       spin_lock(&nn->client_lock);
> >>>>+       list_for_each_safe(pos, next, &nn->client_lru) {
> >>>>+               cl = list_entry(pos, struct nfs4_client, cl_lru);
> >>>>+               /*
> >>>>+                * check all nfs4_ol_stateid of this client
> >>>>+                * for conflicts with 'access'mode.
> >>>>+                */
> >>>>+               if (nfs4_check_deny_bmap(cl, fp, stp, access,
> >>>>share_access)) {
> >>>>+                       if (!test_bit(NFSD4_COURTESY_CLIENT, &cl-
> >>>>>cl_flags)) {
> >>>>+                               /* conflict with non-courtesy
> >>>>client */
> >>>>+                               no_retry = true;
> >>>>+                               cnt = 0;
> >>>>+                               goto out;
> >>>>+                       }
> >>>>+                       /*
> >>>>+                        * if too many to resolve synchronously
> >>>>+                        * then do the rest in background
> >>>>+                        */
> >>>>+                       if (cnt > 100) {
> >>>>+                               set_bit(NFSD4_DESTROY_COURTESY_CLIE
> >>>>NT, &cl->cl_flags);
> >>>>+                               async_cnt++;
> >>>>+                               continue;
> >>>>+                       }
> >>>>+                       if (mark_client_expired_locked(cl))
> >>>>+                               continue;
> >>>>+                       cnt++;
> >>>>+                       list_add(&cl->cl_lru, &reaplist);
> >>>>+               }
> >>>>+       }
> >>>Bruce suggested simply returning NFS4ERR_DELAY for all cases.
> >>>That would simplify this quite a bit for what is a rare edge
> >>>case.
> >>If we always do this asynchronously by returning NFS4ERR_DELAY
> >>for all cases then the following pynfs tests need to be modified
> >>to handle the error:
> >>
> >>RENEW3   st_renew.testExpired                                     :
> >>FAILURE
> >>LKU10    st_locku.testTimedoutUnlock                              :
> >>FAILURE
> >>CLOSE9   st_close.testTimedoutClose2                              :
> >>FAILURE
> >>
> >>and any new tests that opens file have to be prepared to handle
> >>NFS4ERR_DELAY due to the lack of destroy_clientid in 4.0.
> >>
> >>Do we still want to take this approach?
> >NFS4ERR_DELAY is a valid error for both CLOSE and LOCKU (see RFC7530
> >section 13.2 https://urldefense.com/v3/__https://datatracker.ietf.org/doc/html/rfc7530*section-13.2__;Iw!!ACWV5N9M2RV99hQ!f8vZHAJophxXdSSJvnxDCSBSRpWFxEOZBo2ZLvjPzXLVrvMYR8RKcc0_Jvjhng$
> >) so if pynfs complains, then it needs fixing regardless.
> >
> >RENEW, on the other hand, cannot return NFS4ERR_DELAY, but why would it
> >need to? Either the lease is still valid, or else someone is already
> >trying to tear it down due to an expiration event. I don't see why
> >courtesy locks need to add any further complexity to that test.
> 
> RENEW fails in the 2nd open:
> 
>     c.create_confirm(t.word(), access=OPEN4_SHARE_ACCESS_BOTH,
>                      deny=OPEN4_SHARE_DENY_BOTH)     <<======   DENY_BOTH
>     sleeptime = c.getLeaseTime() * 2
>     env.sleep(sleeptime)
>     c2 = env.c2
>     c2.init_connection()
>     c2.open_confirm(t.word(), access=OPEN4_SHARE_ACCESS_READ,    <<=== needs to handle NFS4ERR_DELAY
>                     deny=OPEN4_SHARE_DENY_NONE)
> 
> CLOSE and LOCKU also fail in the OPEN, similar to the RENEW test.
> Any new pynfs 4.0 test that does open might get NFS4ERR_DELAY.

So it's a RENEW test, not the RENEW operation.

A general-purpose client always has to be prepared for DELAY on OPEN.
But pynfs isn't a general-purpose client, and it assumes that it's the
only one touching the files and directories it creates.

Within pynfs we've got a problem that the tests don't necessarily clean
up after themselves completely, so in theory a test could interfere with
later results.

But each test uses its own files--e.g. in the fragment above note that
the file it's testing gets the name t.word(), which is by design unique
to that test.  So it shouldn't be hitting any conflicts with state held
by previous tests.  Am I missing something?

--b.



[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [NTFS 3]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [NTFS 3]     [Samba]     [Device Mapper]     [CEPH Development]

  Powered by Linux