> On Nov 29, 2021, at 2:36 PM, Dai Ngo <dai.ngo@xxxxxxxxxx> wrote:
>
>
> On 11/29/21 11:03 AM, Chuck Lever III wrote:
>> Hello Dai!
>>
>>
>>> On Nov 29, 2021, at 1:32 PM, Dai Ngo <dai.ngo@xxxxxxxxxx> wrote:
>>>
>>>
>>> On 11/29/21 9:30 AM, J. Bruce Fields wrote:
>>>> On Mon, Nov 29, 2021 at 09:13:16AM -0800, dai.ngo@xxxxxxxxxx wrote:
>>>>> Hi Bruce,
>>>>>
>>>>> On 11/21/21 7:04 PM, dai.ngo@xxxxxxxxxx wrote:
>>>>>> On 11/17/21 4:34 PM, J. Bruce Fields wrote:
>>>>>>> On Wed, Nov 17, 2021 at 01:46:02PM -0800, dai.ngo@xxxxxxxxxx wrote:
>>>>>>>> On 11/17/21 9:59 AM, dai.ngo@xxxxxxxxxx wrote:
>>>>>>>>> On 11/17/21 6:14 AM, J. Bruce Fields wrote:
>>>>>>>>>> On Tue, Nov 16, 2021 at 03:06:32PM -0800, dai.ngo@xxxxxxxxxx wrote:
>>>>>>>>>>> Just a reminder that this patch is still waiting for your review.
>>>>>>>>>> Yeah, I was procrastinating and hoping you'd figure out the pynfs
>>>>>>>>>> failure for me....
>>>>>>>>> Last time I ran 4.0 OPEN18 test by itself and it passed. I will run
>>>>>>>>> all OPEN tests together with 5.15-rc7 to see if the problem you've
>>>>>>>>> seen still there.
>>>>>>>> I ran all tests in nfsv4.1 and nfsv4.0 with courteous and non-courteous
>>>>>>>> 5.15-rc7 server.
>>>>>>>>
>>>>>>>> Nfs4.1 results are the same for both courteous and
>>>>>>>> non-courteous server:
>>>>>>>>> Of those: 0 Skipped, 0 Failed, 0 Warned, 169 Passed
>>>>>>>> Results of nfs4.0 with non-courteous server:
>>>>>>>>> Of those: 8 Skipped, 1 Failed, 0 Warned, 577 Passed
>>>>>>>> test failed: LOCK24
>>>>>>>>
>>>>>>>> Results of nfs4.0 with courteous server:
>>>>>>>>> Of those: 8 Skipped, 3 Failed, 0 Warned, 575 Passed
>>>>>>>> tests failed: LOCK24, OPEN18, OPEN30
>>>>>>>>
>>>>>>>> OPEN18 and OPEN30 test pass if each is run by itself.
>>>>>>> Could well be a bug in the tests, I don't know.
>>>>>> The reason OPEN18 failed was because the test timed out waiting for
>>>>>> the reply of an OPEN call. The RPC connection used for the test was
>>>>>> configured with 15 secs timeout.
>>>>>> Note that OPEN18 only fails when
>>>>>> the tests were run with 'all' option, this test passes if it's run
>>>>>> by itself.
>>>>>>
>>>>>> With courteous server, by the time OPEN18 runs, there are about 1026
>>>>>> courtesy 4.0 clients on the server and all of these clients have opened
>>>>>> the same file X with WRITE access. These clients were created by the
>>>>>> previous tests. After each test completed, since 4.0 does not have
>>>>>> session, the client states are not cleaned up immediately on the
>>>>>> server and are allowed to become courtesy clients.
>>>>>>
>>>>>> When OPEN18 runs (about 20 minutes after the 1st test started), it
>>>>>> sends OPEN of file X with OPEN4_SHARE_DENY_WRITE which causes the
>>>>>> server to check for conflicts with courtesy clients. The loop that
>>>>>> checks 1026 courtesy clients for share/access conflict took less
>>>>>> than 1 sec. But it took about 55 secs, on my VM, for the server
>>>>>> to expire all 1026 courtesy clients.
>>>>>>
>>>>>> I modified pynfs to configure the 4.0 RPC connection with 60 seconds
>>>>>> timeout and OPEN18 now consistently passed. The 4.0 test results are
>>>>>> now the same for courteous and non-courteous server:
>>>>>>
>>>>>> 8 Skipped, 1 Failed, 0 Warned, 577 Passed
>>>>>>
>>>>>> Note that 4.1 tests do not suffer this timeout problem because the
>>>>>> 4.1 clients and sessions are destroyed after each test completes.
>>>>> Do you want me to send the patch to increase the timeout for pynfs?
>>>>> or is there any other things you think we should do?
>>>> I don't know.
>>>>
>>>> 55 seconds to clean up 1026 clients is about 50ms per client, which is
>>>> pretty slow. I wonder why. I guess it's probably updating the stable
>>>> storage information. Is /var/lib/nfs/ on your server backed by a hard
>>>> drive or an SSD or something else?
>>> My server is a virtualbox VM that has 1 CPU, 4GB RAM and 64GB of hard
>>> disk.
>>> I think a production system that supports this many clients should
>>> have faster CPUs, faster storage.
>>>
>>>> I wonder if that's an argument for limiting the number of courtesy
>>>> clients.
>>> I think we might want to treat 4.0 clients a bit different from 4.1
>>> clients. With 4.0, every client will become a courtesy client after
>>> the client is done with the export and unmounts it.
>> It should be safe for a server to purge a client's lease immediately
>> if there is no open or lock state associated with it.
>
> In this case, each client has opened files so there are open states
> associated with them.
>
>>
>> When an NFSv4.0 client unmounts, all files should be closed at that
>> point,
>
> I'm not sure pynfs does proper clean up after each subtest, I will
> check. There must be state associated with the client in order for
> it to become courtesy client.

Makes sense. Then a synthetic client like pynfs can DoS a courteous
server.


>> so the server can wait for the lease to expire and purge it
>> normally. Or am I missing something?
>
> When 4.0 client lease expires and there are still states associated
> with the client then the server allows this client to become courtesy
> client.

I think the same thing happens if an NFSv4.1 client neglects to send
DESTROY_SESSION / DESTROY_CLIENTID. Either such a client is broken
or malicious, but the server faces the same issue of protecting
itself from a DoS attack.

IMO you should consider limiting the number of courteous clients
the server can hold onto. Let's say that number is 1000. When the
server wants to turn a 1001st client into a courteous client, it
can simply expire and purge the oldest courteous client on its
list. Otherwise, over time, the 24-hour expiry will reduce the
set of courteous clients back to zero.

What do you think?


>>> Since there is
>>> no destroy session/client with 4.0, the courteous server allows the
>>> client to be around and becomes a courtesy client.
>>> So after awhile,
>>> even with normal usage, there will be lots 4.0 courtesy clients
>>> hanging around and these clients won't be destroyed until 24hrs
>>> later, or until they cause conflicts with other clients.
>>>
>>> We can reduce the courtesy_client_expiry time for 4.0 clients from
>>> 24hrs to 15/20 mins, enough for most network partition to heal?,
>>> or limit the number of 4.0 courtesy clients. Or don't support 4.0
>>> clients at all which is my preference since I think in general users
>>> should skip 4.0 and use 4.1 instead.
>>>
>>> -Dai
>> --
>> Chuck Lever
>>
>>
>>

--
Chuck Lever
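
To make the cap idea from the reply above concrete, here is a rough userspace sketch in Python. It is not nfsd code. The class name `CourtesyClientList`, its methods, and the `COURTESY_LIMIT` constant are all hypothetical, and the model assumes clients are added with non-decreasing timestamps so that insertion order is also age order. It only shows the two rules discussed in the thread: evict the oldest courtesy client when the cap would be exceeded, and expire anything older than the expiry window.

```python
from collections import OrderedDict

# Assumed cap; 1000 is only the "let's say" figure from the thread.
COURTESY_LIMIT = 1000


class CourtesyClientList:
    """Hypothetical model of a capped list of courtesy clients."""

    def __init__(self, limit=COURTESY_LIMIT):
        self.limit = limit
        # client_id -> time the client became a courtesy client.
        # Oldest entries sit at the front of the ordering.
        self.clients = OrderedDict()

    def add(self, client_id, now):
        """Mark a client as courtesy; return the IDs purged to honor the cap."""
        expired = []
        # Re-adding a known client moves it to the "newest" end.
        self.clients.pop(client_id, None)
        self.clients[client_id] = now
        while len(self.clients) > self.limit:
            oldest_id, _ = self.clients.popitem(last=False)
            expired.append(oldest_id)
        return expired

    def expire_older_than(self, now, max_age):
        """Time-based expiry (24 hours in the thread); return the purged IDs."""
        expired = []
        while self.clients:
            oldest_id, since = next(iter(self.clients.items()))
            if now - since < max_age:
                break  # everything after this entry is newer
            self.clients.popitem(last=False)
            expired.append(oldest_id)
        return expired
```

With a cap of 1000, the 1001st call to `add()` returns the ID of the first client added, which the caller would then expire and purge. A real server would also need locking and would pay the stable-storage cost on each purge; neither is modeled here.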