Re: OPEN_XOR_DELEGATION performance problems

Unfortunately, for now, nfs4j doesn't support any type of delegation.

Tigran.

----- Original Message -----
> From: "Cedric Blancher" <cedric.blancher@xxxxxxxxx>
> To: "Linux NFS Mailing List" <linux-nfs@xxxxxxxxxxxxxxx>, "Tiramisu Mokka" <kofemann@xxxxxxxxx>
> Sent: Wednesday, 20 November, 2024 08:39:00
> Subject: Re: OPEN_XOR_DELEGATION performance problems

> On Tue, 19 Nov 2024 at 17:31, Chuck Lever III <chuck.lever@xxxxxxxxxx> wrote:
>>
>>
>>
>> > On Nov 19, 2024, at 10:09 AM, Trond Myklebust <trondmy@xxxxxxxxxxxxxxx> wrote:
>> >
>> > On Tue, 2024-11-19 at 06:45 -0500, Jeff Layton wrote:
>> >> We attempted to implement the "delstid" draft for v6.13, but have had
>> >> to drop the patches for it. After merge, we got a couple of reports of
>> >> a performance issue due to the OPEN_XOR_DELEGATION patch:
>> >>
>> >>
>> >> https://lore.kernel.org/linux-nfs/202409161645.d44bced5-oliver.sang@xxxxxxxxx/
>> >>
>> >> Once we enable OPEN_XOR_DELEGATION support, the fsmark "App Overhead"
>> >> statistic spikes significantly. The kernel patch for this is very
>> >> simple, and doesn't seem likely to cause a performance issue on its
>> >> own. My theory is that this test is one that causes the client to
>> >> return the delegation, and since it doesn't have an open stateid, it
>> >> has to reestablish one during the test run, and that causes the app
>> >> overhead stat to spike.
>> >>
>> >> Trond, Tom, Mike -- I know that the HS Anvil has support for
>> >> OPEN_XOR_DELEGATION. If you run the fsmark test against it with that
>> >> support both enabled and disabled (either on the client or server
>> >> side), do you see a similar spike in "App Overhead"?
>> >>
>> >> If so, then I suspect we need to consider limiting the use of that flag
>> >> in some cases. I have no idea what heuristic we'd use to decide this
>> >> though.
>> >
>> > As already stated when we discussed this at Bakeathon: the server is
>> > still in charge of heuristics w.r.t. whether or not there may be
>> > contention for the file. The OPEN_XOR_DELEGATION flag changes nothing
>> > in that respect.
>>
>> fsmark is a single-client test. There should be no contention
>> for any files during this test.
>>
>>
>> > Yes, I'm sure you can find tests which cause recalls of delegations,
>> > and those will be marginally slower when the client has to re-establish
>> > an open stateid.
>>
>> The fsmark result regressed 92%.
>>
>>
>> > However the issue with those tests is that they are
>> > deliberately setting up a situation where the server ideally shouldn't
>> > be handing out a delegation at all.
>> >
>> > Furthermore, this is no different than a situation where the client
>> > used a delegation to cache the open (i.e. avoid sending an OPEN call)
>> > after the application closed the file and then later re-opened it.
>> > So the point is that this is not a situation that is unique to
>> > OPEN_XOR_DELEGATION. It is just a consequence of the client's ability
>> > to cache open state.
>>
>> The regression was bisected to Jeff's XOR patch on two
>> separate occasions. This does indeed appear to be a
>> situation that is unique to OPEN_XOR_DELEGATION.
>>
>> It's possible that our theory of the failure is wrong.
>> As developers of the only other server implementation of
>> OPEN_XOR_DELEGATION, can Hammerspace help us troubleshoot
>> this issue?
>>
> 
> Doesn't Tigran's dcache.org nfs4j server also support OPEN_XOR_DELEGATION?
> 
> Ced
> --
> Cedric Blancher <cedric.blancher@xxxxxxxxx>
> [https://plus.google.com/u/0/+CedricBlancher/]
> Institute Pasteur


