Re: [PATCH RFC v5 0/2] nfsd: Initial implementation of NFSv4 Courteous Server

On Mon, Nov 29, 2021 at 09:13:16AM -0800, dai.ngo@xxxxxxxxxx wrote:
> Hi Bruce,
> 
> On 11/21/21 7:04 PM, dai.ngo@xxxxxxxxxx wrote:
> >
> >On 11/17/21 4:34 PM, J. Bruce Fields wrote:
> >>On Wed, Nov 17, 2021 at 01:46:02PM -0800, dai.ngo@xxxxxxxxxx wrote:
> >>>On 11/17/21 9:59 AM, dai.ngo@xxxxxxxxxx wrote:
> >>>>On 11/17/21 6:14 AM, J. Bruce Fields wrote:
> >>>>>On Tue, Nov 16, 2021 at 03:06:32PM -0800, dai.ngo@xxxxxxxxxx wrote:
> >>>>>>Just a reminder that this patch is still waiting for your review.
> >>>>>Yeah, I was procrastinating and hoping you'd figure out the pynfs
> >>>>>failure for me....
> >>>>Last time I ran the 4.0 OPEN18 test by itself and it passed. I will
> >>>>run all OPEN tests together with 5.15-rc7 to see if the problem
> >>>>you've seen is still there.
> >>>I ran all tests in nfsv4.1 and nfsv4.0 with courteous and non-courteous
> >>>5.15-rc7 server.
> >>>
> >>>Nfs4.1 results are the same for both courteous and
> >>>non-courteous server:
> >>>>Of those: 0 Skipped, 0 Failed, 0 Warned, 169 Passed
> >>>Results of nfs4.0 with non-courteous server:
> >>>>Of those: 8 Skipped, 1 Failed, 0 Warned, 577 Passed
> >>>test failed: LOCK24
> >>>
> >>>Results of nfs4.0 with courteous server:
> >>>>Of those: 8 Skipped, 3 Failed, 0 Warned, 575 Passed
> >>>tests failed: LOCK24, OPEN18, OPEN30
> >>>
> >>>OPEN18 and OPEN30 pass if each is run by itself.
> >>Could well be a bug in the tests, I don't know.
> >
> >OPEN18 failed because the test timed out waiting for the reply to
> >an OPEN call. The RPC connection used for the test was configured
> >with a 15-second timeout. Note that OPEN18 only fails when the tests
> >are run with the 'all' option; it passes if it's run by itself.
> >
> >With courteous server, by the time OPEN18 runs, there are about 1026
> >courtesy 4.0 clients on the server and all of these clients have opened
> >the same file X with WRITE access. These clients were created by the
> >previous tests. Since 4.0 does not have sessions, the client states
> >are not cleaned up immediately on the server after each test
> >completes, and are allowed to become courtesy clients.
> >
> >When OPEN18 runs (about 20 minutes after the 1st test started), it
> >sends OPEN of file X with OPEN4_SHARE_DENY_WRITE which causes the
> >server to check for conflicts with courtesy clients. The loop that
> >checks 1026 courtesy clients for share/access conflict took less
> >than 1 sec. But it took about 55 secs, on my VM, for the server
> >to expire all 1026 courtesy clients.
> >
> >I modified pynfs to configure the 4.0 RPC connection with a 60-second
> >timeout and OPEN18 now consistently passes. The 4.0 test results are
> >now the same for courteous and non-courteous server:
> >
> >8 Skipped, 1 Failed, 0 Warned, 577 Passed
> >
> >Note that 4.1 tests do not suffer this timeout problem because the
> >4.1 clients and sessions are destroyed after each test completes.
> 
> Do you want me to send the patch to increase the timeout for pynfs?
> Or is there anything else you think we should do?

I don't know.

55 seconds to clean up 1026 clients is about 50ms per client, which is
pretty slow.  I wonder why.  I guess it's probably updating the stable
storage information.  Is /var/lib/nfs/ on your server backed by a hard
drive or an SSD or something else?
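
As a rough check on that per-client figure, here is a minimal sketch of the
arithmetic. It uses only the two numbers reported above; the variable names
are illustrative, and it assumes the cost is spread evenly across clients,
which may not be the case.

```python
# Figures reported earlier in this thread (measured on a VM).
courtesy_clients = 1026
expire_seconds = 55.0

# Assumption: expiry cost is spread evenly across all clients.
per_client_ms = expire_seconds / courtesy_clients * 1000
print(f"{per_client_ms:.1f} ms per client")  # prints "53.6 ms per client"
```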

I wonder if that's an argument for limiting the number of courtesy
clients.
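
To get a feel for what such a limit might look like, here is a rough sketch
that estimates how many courtesy clients could be expired within a given RPC
timeout. It is only an estimate under stated assumptions: the helper name is
hypothetical, the per-client cost is taken from the 55 secs / 1026 clients
reported above and assumed to scale linearly, and the sub-second conflict
check and all other request processing are ignored.

```python
import math

def max_courtesy_clients(timeout_s, per_client_s):
    """Largest client count whose sequential expiry fits in timeout_s.

    Hypothetical helper; assumes expiry cost grows linearly with clients.
    """
    return math.floor(timeout_s / per_client_s)

# Per-client expiry cost, from the figures reported in this thread.
per_client_s = 55.0 / 1026

print(max_courtesy_clients(15.0, per_client_s))  # 15 sec pynfs timeout -> 279
print(max_courtesy_clients(60.0, per_client_s))  # 60 sec pynfs timeout -> 1119
```

On a server with faster stable storage the per-client cost, and so these
counts, could be very different.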

--b.


