Re: [PATCH RFC v5 0/2] nfsd: Initial implementation of NFSv4 Courteous Server

> On Nov 29, 2021, at 7:11 PM, Dai Ngo <dai.ngo@xxxxxxxxxx> wrote:
> 
> 
>> On 11/29/21 1:10 PM, Chuck Lever III wrote:
>> 
>>>> On Nov 29, 2021, at 2:36 PM, Dai Ngo <dai.ngo@xxxxxxxxxx> wrote:
>>> 
>>> 
>>> On 11/29/21 11:03 AM, Chuck Lever III wrote:
>>>> Hello Dai!
>>>> 
>>>> 
>>>>> On Nov 29, 2021, at 1:32 PM, Dai Ngo <dai.ngo@xxxxxxxxxx> wrote:
>>>>> 
>>>>> 
>>>>> On 11/29/21 9:30 AM, J. Bruce Fields wrote:
>>>>>> On Mon, Nov 29, 2021 at 09:13:16AM -0800, dai.ngo@xxxxxxxxxx wrote:
>>>>>>> Hi Bruce,
>>>>>>> 
>>>>>>> On 11/21/21 7:04 PM, dai.ngo@xxxxxxxxxx wrote:
>>>>>>>> On 11/17/21 4:34 PM, J. Bruce Fields wrote:
>>>>>>>>> On Wed, Nov 17, 2021 at 01:46:02PM -0800, dai.ngo@xxxxxxxxxx wrote:
>>>>>>>>>> On 11/17/21 9:59 AM, dai.ngo@xxxxxxxxxx wrote:
>>>>>>>>>>> On 11/17/21 6:14 AM, J. Bruce Fields wrote:
>>>>>>>>>>>> On Tue, Nov 16, 2021 at 03:06:32PM -0800, dai.ngo@xxxxxxxxxx wrote:
>>>>>>>>>>>>> Just a reminder that this patch is still waiting for your review.
>>>>>>>>>>>> Yeah, I was procrastinating and hoping you'd figure out the pynfs
>>>>>>>>>>>> failure for me....
>>>>>>>>>>> Last time I ran 4.0 OPEN18 test by itself and it passed. I will run
>>>>>>>>>>> all OPEN tests together with 5.15-rc7 to see if the problem you've
>>>>>>>>>>> seen is still there.
>>>>>>>>>> I ran all tests in nfsv4.1 and nfsv4.0 with courteous and non-courteous
>>>>>>>>>> 5.15-rc7 server.
>>>>>>>>>> 
>>>>>>>>>> Nfs4.1 results are the same for both courteous and
>>>>>>>>>> non-courteous server:
>>>>>>>>>>> Of those: 0 Skipped, 0 Failed, 0 Warned, 169 Passed
>>>>>>>>>> Results of nfs4.0 with non-courteous server:
>>>>>>>>>>> Of those: 8 Skipped, 1 Failed, 0 Warned, 577 Passed
>>>>>>>>>> test failed: LOCK24
>>>>>>>>>> 
>>>>>>>>>> Results of nfs4.0 with courteous server:
>>>>>>>>>>> Of those: 8 Skipped, 3 Failed, 0 Warned, 575 Passed
>>>>>>>>>> tests failed: LOCK24, OPEN18, OPEN30
>>>>>>>>>> 
>>>>>>>>>> OPEN18 and OPEN30 test pass if each is run by itself.
>>>>>>>>> Could well be a bug in the tests, I don't know.
>>>>>>>> The reason OPEN18 failed was because the test timed out waiting for
>>>>>>>> the reply of an OPEN call. The RPC connection used for the test was
>>>>>>>> configured with 15 secs timeout. Note that OPEN18 only fails when
>>>>>>>> the tests were run with 'all' option, this test passes if it's run
>>>>>>>> by itself.
>>>>>>>> 
>>>>>>>> With courteous server, by the time OPEN18 runs, there are about 1026
>>>>>>>> courtesy 4.0 clients on the server and all of these clients have opened
>>>>>>>> the same file X with WRITE access. These clients were created by the
>>>>>>>> previous tests. After each test completed, since 4.0 does not have
>>>>>>>> session, the client states are not cleaned up immediately on the
>>>>>>>> server and are allowed to become courtesy clients.
>>>>>>>> 
>>>>>>>> When OPEN18 runs (about 20 minutes after the 1st test started), it
>>>>>>>> sends OPEN of file X with OPEN4_SHARE_DENY_WRITE which causes the
>>>>>>>> server to check for conflicts with courtesy clients. The loop that
>>>>>>>> checks 1026 courtesy clients for share/access conflict took less
>>>>>>>> than 1 sec. But it took about 55 secs, on my VM, for the server
>>>>>>>> to expire all 1026 courtesy clients.
>>>>>>>> 
>>>>>>>> I modified pynfs to configure the 4.0 RPC connection with 60 seconds
>>>>>>>> timeout and OPEN18 now consistently passed. The 4.0 test results are
>>>>>>>> now the same for courteous and non-courteous server:
>>>>>>>> 
>>>>>>>> 8 Skipped, 1 Failed, 0 Warned, 577 Passed
>>>>>>>> 
>>>>>>>> Note that 4.1 tests do not suffer this timeout problem because the
>>>>>>>> 4.1 clients and sessions are destroyed after each test completes.
>>>>>>> Do you want me to send the patch to increase the timeout for pynfs?
>>>>>>> Or is there anything else you think we should do?
>>>>>> I don't know.
>>>>>> 
>>>>>> 55 seconds to clean up 1026 clients is about 50ms per client, which is
>>>>>> pretty slow.  I wonder why.  I guess it's probably updating the stable
>>>>>> storage information.  Is /var/lib/nfs/ on your server backed by a hard
>>>>>> drive or an SSD or something else?
>>>>> My server is a virtualbox VM that has 1 CPU, 4GB RAM and 64GB of hard
>>>>> disk. I think a production system that supports this many clients should
>>>>> have faster CPUs, faster storage.
>>>>> 
>>>>>> I wonder if that's an argument for limiting the number of courtesy
>>>>>> clients.
>>>>> I think we might want to treat 4.0 clients a bit differently from 4.1
>>>>> clients. With 4.0, every client will become a courtesy client after
>>>>> the client is done with the export and unmounts it.
>>>> It should be safe for a server to purge a client's lease immediately
>>>> if there is no open or lock state associated with it.
>>> In this case, each client has opened files so there are open states
>>> associated with them.
>>> 
>>>> When an NFSv4.0 client unmounts, all files should be closed at that
>>>> point,
>>> I'm not sure pynfs does proper clean up after each subtest, I will
>>> check. There must be state associated with the client in order for
>>> it to become a courtesy client.
>> Makes sense. Then a synthetic client like pynfs can DoS a courteous
>> server.
>> 
>> 
>>>> so the server can wait for the lease to expire and purge it
>>>> normally. Or am I missing something?
>>> When a 4.0 client's lease expires and there is still state associated
>>> with the client, the server allows this client to become a courtesy
>>> client.
>> I think the same thing happens if an NFSv4.1 client neglects to send
>> DESTROY_SESSION / DESTROY_CLIENTID. Either such a client is broken
>> or malicious, but the server faces the same issue of protecting
>> itself from a DoS attack.
>> 
>> IMO you should consider limiting the number of courteous clients
>> the server can hold onto. Let's say that number is 1000. When the
>> server wants to turn a 1001st client into a courteous client, it
>> can simply expire and purge the oldest courteous client on its
>> list. Otherwise, over time, the 24-hour expiry will reduce the
>> set of courteous clients back to zero.
>> 
>> What do you think?
> 
> Limiting the number of courteous clients to handle the cases of
> broken/malicious 4.1 clients seems reasonable as the last resort.
> 
> I think if a malicious 4.1 client can mount the server's export,
> open a file (to create state), and repeat the same with a different
> client ID, then some basic security is already broken: unauthorized
> clients are being allowed to mount the server's exports.

You can do this today with AUTH_SYS. I consider it a genuine attack surface.


> I think if we have to enforce a limit, then it's only for handling
> of seriously buggy 4.1 clients which should not be the norm. The
> issue with this is how to pick an optimal number that is suitable
> for the running server which can be a very slow or a very fast server.
> 
> Note that even if we impose a limit, that does not completely solve
> the problem with the pynfs 4.0 test, since its RPC timeout is configured
> with 15 secs, which is just enough to expire 277 clients based on 53ms
> for each client, unless we limit it to ~270 clients, which I think is
> too low.
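
A rough check of the arithmetic above, as a minimal sketch. The inputs (55 secs to expire 1026 clients, a 15 sec RPC timeout) come from this thread; the function names are made up for illustration and are not part of pynfs or nfsd.

```python
def per_client_expiry_ms(total_secs, clients):
    """Average time to expire one courtesy client, in milliseconds."""
    return total_secs * 1000.0 / clients


def clients_expirable_within(timeout_secs, per_client_ms):
    """How many clients can be expired before the RPC timeout fires."""
    return int(timeout_secs * 1000.0 // per_client_ms)


# Figures reported earlier in the thread: 55 secs for 1026 clients.
cost_ms = per_client_expiry_ms(55, 1026)
print(round(cost_ms, 1))                       # average cost per client
print(clients_expirable_within(15, cost_ms))   # budget with the measured cost
print(clients_expirable_within(15, 53))        # budget with the rounded 53ms
```

Depending on how the per-client cost is rounded, the budget lands at roughly 280 clients, in the same range as the figure quoted above. Either way, a cap near 1000 would not by itself keep a single conflicting OPEN under a 15 sec timeout on a server this slow.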
> 
> This is what I plan to do:
> 
> 1. do not support 4.0 courteous clients, for sure.

Not supporting 4.0 isn’t an option, IMHO. It is a fully supported protocol at this time, and the same exposure exists for 4.1, it’s just a little harder to exploit.

If you submit the courteous server patch without support for 4.0, I think it needs to include a plan for how 4.0 will be added later.


> 2. limit the number of courteous clients to 1000 (?), if you still
> think we need it.

I think this limit is necessary. It can be set based on the server’s physical memory size if a dynamic limit is desired.
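
To make the idea concrete, here is a minimal user-space sketch of a capped courtesy-client list that expires the oldest entry when a new one would exceed the limit. It is only a model of the policy discussed in this thread, not nfsd code: the class name, method names, and the use of plain integers for client IDs and timestamps are all assumptions made for illustration.

```python
from collections import OrderedDict


class CourtesyClientList:
    """Model of a capped courtesy-client list (hypothetical, not nfsd code)."""

    def __init__(self, limit=1000):
        if limit < 1:
            raise ValueError("limit must be at least 1")
        self.limit = limit
        # client_id -> time the client became a courtesy client,
        # kept in insertion order so the oldest entry is always first.
        self._clients = OrderedDict()

    def add(self, client_id, now):
        """Track client_id as a courtesy client.

        Returns the list of client ids that were expired to make room.
        """
        expired = []
        self._clients.pop(client_id, None)  # re-adding refreshes its position
        while len(self._clients) >= self.limit:
            oldest_id, _ = self._clients.popitem(last=False)
            expired.append(oldest_id)
        self._clients[client_id] = now
        return expired

    def expire_older_than(self, now, max_age):
        """Expire every client that has been a courtesy client for max_age or longer."""
        expired = []
        while self._clients:
            oldest_id, since = next(iter(self._clients.items()))
            if now - since < max_age:
                break
            self._clients.popitem(last=False)
            expired.append(oldest_id)
        return expired

    def __len__(self):
        return len(self._clients)
```

With a limit of 1000, adding a 1001st client returns the ID of the oldest one for the caller to expire and purge. Calling `expire_older_than(now, 24 * 3600)` periodically models the 24-hour expiry that drains the list over time. A dynamic limit would only change the value passed as `limit`.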


> Pls let me know what you think.
> 
> Thanks,
> -Dai
> 
>> 
>> 
>>>>> Since there is
>>>>> no destroy session/client with 4.0, the courteous server allows the
>>>>> client to be around and become a courtesy client. So after a while,
>>>>> even with normal usage, there will be lots of 4.0 courtesy clients
>>>>> hanging around and these clients won't be destroyed until 24hrs
>>>>> later, or until they cause conflicts with other clients.
>>>>> 
>>>>> We can reduce the courtesy_client_expiry time for 4.0 clients from
>>>>> 24hrs to 15/20 mins (enough for most network partitions to heal?),
>>>>> or limit the number of 4.0 courtesy clients. Or don't support 4.0
>>>>> clients at all, which is my preference since I think in general users
>>>>> should skip 4.0 and use 4.1 instead.
>>>>> 
>>>>> -Dai
>>>> --
>>>> Chuck Lever
>>>> 
>>>> 
>>>> 
>> --
>> Chuck Lever
>> 
>> 
>> 



