Re: [PATCH RFC v5 0/2] nfsd: Initial implementation of NFSv4 Courteous Server

On 11/29/21 5:42 PM, Chuck Lever III wrote:
On Nov 29, 2021, at 7:11 PM, Dai Ngo <dai.ngo@xxxxxxxxxx> wrote:


On 11/29/21 1:10 PM, Chuck Lever III wrote:

On Nov 29, 2021, at 2:36 PM, Dai Ngo <dai.ngo@xxxxxxxxxx> wrote:

On 11/29/21 11:03 AM, Chuck Lever III wrote:
Hello Dai!


On Nov 29, 2021, at 1:32 PM, Dai Ngo <dai.ngo@xxxxxxxxxx> wrote:


On 11/29/21 9:30 AM, J. Bruce Fields wrote:
On Mon, Nov 29, 2021 at 09:13:16AM -0800, dai.ngo@xxxxxxxxxx wrote:
Hi Bruce,

On 11/21/21 7:04 PM, dai.ngo@xxxxxxxxxx wrote:
On 11/17/21 4:34 PM, J. Bruce Fields wrote:
On Wed, Nov 17, 2021 at 01:46:02PM -0800, dai.ngo@xxxxxxxxxx wrote:
On 11/17/21 9:59 AM, dai.ngo@xxxxxxxxxx wrote:
On 11/17/21 6:14 AM, J. Bruce Fields wrote:
On Tue, Nov 16, 2021 at 03:06:32PM -0800, dai.ngo@xxxxxxxxxx wrote:
Just a reminder that this patch is still waiting for your review.
Yeah, I was procrastinating and hoping you'd figure out the pynfs
failure for me....
Last time I ran the 4.0 OPEN18 test by itself and it passed. I will run
all OPEN tests together with 5.15-rc7 to see if the problem you've
seen is still there.
I ran all tests in NFSv4.1 and NFSv4.0 against courteous and non-courteous
5.15-rc7 servers.

NFSv4.1 results are the same for both the courteous and
non-courteous server:
Of those: 0 Skipped, 0 Failed, 0 Warned, 169 Passed
Results of NFSv4.0 with the non-courteous server:
Of those: 8 Skipped, 1 Failed, 0 Warned, 577 Passed
test failed: LOCK24

Results of NFSv4.0 with the courteous server:
Of those: 8 Skipped, 3 Failed, 0 Warned, 575 Passed
tests failed: LOCK24, OPEN18, OPEN30

OPEN18 and OPEN30 pass if each is run by itself.
Could well be a bug in the tests, I don't know.
The reason OPEN18 failed is that the test timed out waiting for
the reply to an OPEN call. The RPC connection used for the test was
configured with a 15-second timeout. Note that OPEN18 only fails when
the tests are run with the 'all' option; the test passes if it's run
by itself.

With the courteous server, by the time OPEN18 runs, there are about 1026
4.0 courtesy clients on the server, and all of these clients have opened
the same file X with WRITE access. These clients were created by the
previous tests. After each test completes, since 4.0 does not have
sessions, the client states are not cleaned up immediately on the
server and the clients are allowed to become courtesy clients.

When OPEN18 runs (about 20 minutes after the 1st test started), it
sends an OPEN of file X with OPEN4_SHARE_DENY_WRITE, which causes the
server to check for conflicts with the courtesy clients. The loop that
checks all 1026 courtesy clients for share/access conflicts took less
than 1 sec, but it took about 55 secs, on my VM, for the server
to expire all 1026 courtesy clients.
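
Conceptually, the conflict handling is something like the sketch below.
It is only an illustration of where the time goes; courtesy_client_list
and nfs4_deny_conflict() are made-up names, not the actual patch code:

static void expire_conflicting_courtesy_clients(struct nfsd_net *nn, u32 deny)
{
	struct nfs4_client *clp, *tmp;

	list_for_each_entry_safe(clp, tmp, &nn->courtesy_client_list, cl_lru) {
		/* The share/deny scan itself is cheap (<1 sec for ~1000 clients). */
		if (!nfs4_deny_conflict(clp, deny))
			continue;
		/*
		 * Expiring the client is the expensive part (~50ms each),
		 * likely dominated by removing its persistent record
		 * under /var/lib/nfs.
		 */
		expire_client(clp);
	}
}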

I modified pynfs to configure the 4.0 RPC connection with a 60-second
timeout, and OPEN18 now consistently passes. The 4.0 test results are
now the same for the courteous and non-courteous server:

8 Skipped, 1 Failed, 0 Warned, 577 Passed

Note that the 4.1 tests do not suffer this timeout problem because the
4.1 clients and sessions are destroyed after each test completes.
Do you want me to send the patch to increase the timeout for pynfs,
or are there other things you think we should do?
I don't know.

55 seconds to clean up 1026 clients is about 50ms per client, which is
pretty slow.  I wonder why.  I guess it's probably updating the stable
storage information.  Is /var/lib/nfs/ on your server backed by a hard
drive or an SSD or something else?
My server is a VirtualBox VM with 1 CPU, 4GB RAM, and a 64GB hard
disk. I think a production system that supports this many clients would
have faster CPUs and faster storage.

I wonder if that's an argument for limiting the number of courtesy
clients.
I think we might want to treat 4.0 clients a bit differently from 4.1
clients. With 4.0, every client becomes a courtesy client after
the client is done with the export and unmounts it.
It should be safe for a server to purge a client's lease immediately
if there is no open or lock state associated with it.
In this case, each client has opened files so there are open states
associated with them.

When an NFSv4.0 client unmounts, all files should be closed at that
point,
I'm not sure pynfs does proper cleanup after each subtest; I will
check. There must be state associated with the client in order for
it to become a courtesy client.
Makes sense. Then a synthetic client like pynfs can DoS a courteous
server.


so the server can wait for the lease to expire and purge it
normally. Or am I missing something?
When a 4.0 client's lease expires and there is still state associated
with the client, the server allows the client to become a courtesy
client.
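
In other words, the expiry decision is roughly the following sketch;
client_has_state() and mark_client_courtesy() are stand-in names for
whatever the server actually uses:

static void handle_expired_lease(struct nfs4_client *clp)
{
	/* No open or lock state left: safe to purge the lease immediately. */
	if (!client_has_state(clp)) {
		expire_client(clp);
		return;
	}
	/* State remains: keep the client around as a courtesy client. */
	mark_client_courtesy(clp);
}
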
I think the same thing happens if an NFSv4.1 client neglects to send
DESTROY_SESSION / DESTROY_CLIENTID. Such a client is either broken
or malicious, but either way the server faces the same issue of
protecting itself from a DoS attack.

IMO you should consider limiting the number of courteous clients
the server can hold onto. Let's say that number is 1000. When the
server wants to turn a 1001st client into a courteous client, it
can simply expire and purge the oldest courteous client on its
list. Otherwise, over time, the 24-hour expiry will reduce the
set of courteous clients back to zero.
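
A minimal sketch of what I have in mind, with made-up field names:

#define NFSD_COURTESY_CLIENT_LIMIT 1000

static void nfsd_add_courtesy_client(struct nfsd_net *nn, struct nfs4_client *clp)
{
	/* List is kept in LRU order, oldest courteous client at the head. */
	if (nn->courtesy_client_count >= NFSD_COURTESY_CLIENT_LIMIT) {
		struct nfs4_client *oldest;

		oldest = list_first_entry(&nn->courtesy_client_list,
					  struct nfs4_client, cl_lru);
		expire_client(oldest);
		nn->courtesy_client_count--;
	}
	list_add_tail(&clp->cl_lru, &nn->courtesy_client_list);
	nn->courtesy_client_count++;
}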

What do you think?
Limiting the number of courteous clients to handle the cases of
broken/malicious 4.1 clients seems reasonable as a last resort.

I think if a malicious 4.1 client could mount the server's export,
open a file (to create state), and repeat the same with a different
client ID, then it seems like some basic security was already broken:
unauthorized clients are being allowed to mount the server's exports.
You can do this today with AUTH_SYS. I consider it a genuine attack surface.


I think if we have to enforce a limit, then it's only for handling
seriously buggy 4.1 clients, which should not be the norm. The
issue with this is how to pick an optimal number that is suitable
for the running server, which can be a very slow or a very fast server.

Note that even if we impose a limit, that does not completely solve
the problem with the pynfs 4.0 tests, since their RPC timeout is
configured as 15 secs, which is just enough to expire about 277 clients
at 53ms per client, unless we limit it to ~270 clients, which I think
is too low.

This is what I plan to do:

1. do not support 4.0 courteous clients, for sure.
Not supporting 4.0 isn't an option, IMHO. It is a fully supported protocol at this time, and the same exposure exists for 4.1; it's just a little harder to exploit.

If you submit the courteous server patch without support for 4.0, I think it needs to include a plan for how 4.0 will be added later.

Seems like we should support both 4.0 and 4.x (x>=1) at the same time.



2. limit the number of courteous clients to 1000 (?), if you still
think we need it.
  I think this limit is necessary. It can be set based on the server’s physical memory size if a dynamic limit is desired.
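
For example, something along these lines; this is only a sketch, and the
one-client-per-megabyte scaling and clamp bounds are arbitrary assumptions:

static unsigned int nfsd_courtesy_client_limit(void)
{
	/* Roughly one courtesy client per MB of RAM, within sane bounds. */
	unsigned long mb = totalram_pages() >> (20 - PAGE_SHIFT);

	return clamp_t(unsigned int, mb, 100, 10000);
}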

Just to be clear, the problem of pynfs with 4.0 is that the server takes
~55 secs to expire 1026 4.0 courteous clients, which comes out to ~50ms
per client. This causes the test to time out waiting for the RPC reply
to the OPEN that triggers the conflicts.

I don't know exactly where the time is spent in the process of expiring
a client, but as Bruce mentioned, it could be related to the time to access
/var/lib/nfs to remove the client's persistent record. I think that is most
likely the case, because the number of states owned by each client should be
small since each test is short and does simple ops. So I think this problem
is related to the number of clients and not to the number of states owned by
the clients. It is not the memory-shortage problem due to too much state,
which we have planned to address after the initial phase.

I'd vote to use a static limit for now, say 1000 clients, to avoid
complicating the courteous server code for something that would not
happen most of the time.

-Dai



Pls let me know what you think.

Thanks,
-Dai


Since there is
no destroy session/client with 4.0, the courteous server allows the
client to stick around and become a courtesy client. So after a while,
even with normal usage, there will be lots of 4.0 courtesy clients
hanging around, and these clients won't be destroyed until 24 hours
later, or until they cause conflicts with other clients.

We can reduce the courtesy_client_expiry time for 4.0 clients from
24 hours to 15-20 mins, enough for most network partitions to heal(?),
or limit the number of 4.0 courtesy clients. Or don't support 4.0
clients at all, which is my preference, since I think in general users
should skip 4.0 and use 4.1 instead.
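
A sketch of the shorter-expiry option; the helper name and the constants
are illustrative only:

static time64_t courtesy_client_expiry(struct nfs4_client *clp)
{
	/* 4.0 has no DESTROY_CLIENTID, so its courtesy clients pile up. */
	if (clp->cl_minorversion == 0)
		return 20 * 60;		/* ~20 minutes */
	return 24 * 60 * 60;		/* 24 hours */
}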

-Dai
--
Chuck Lever



--
Chuck Lever





