Re: [PATCH v9 0/7] fuse: full atomic open and atomic-open-revalidate

Bernd Schubert <bernd.schubert@xxxxxxxxxxx> · Thu, 21 Sep 2023 16:44:34 +0200

On 9/21/23 16:24, Amir Goldstein wrote:
On Thu, Sep 21, 2023 at 3:00 PM Bernd Schubert <bschubert@xxxxxxx> wrote:

On 9/21/23 11:33, Amir Goldstein wrote:
On Thu, Sep 21, 2023 at 9:31 AM Bernd Schubert <bschubert@xxxxxxx> wrote:

In FUSE, as of now, uncached lookups are expensive over the wire,
e.g. they add latency and put load on (metadata) servers when
thousands of clients are involved. With atomic open, the lookup
before open can be avoided.

Here is the link to performance numbers
https://lore.kernel.org/linux-fsdevel/20220322121212.5087-1-dharamhans87@xxxxxxxxx/

Here is the libfuse pull request
https://github.com/libfuse/libfuse/pull/813

The patches are passing passthrough_hp xfstests (libfuse part applied),
although we had to introduce umount retries into xfstests, as recent
kernels/xfstests fail umount in some tests with
EBUSY - independent of atomic open. (Although outstanding for v7)

Hi Bernd!

I was using xfstests to test passthrough_hp (for FUSE kernel passthrough).
FYI, I have made some improvements to the mount helper
in libfuse [1] to support remount, which helps pass a few tests.

Thanks, I just asked there to send it upstream separately.


So far, I have all the tests in group -g quick.rw pass with the baseline
passthrough_hp (over xfs).

Do you have a baseline for the entire quick/auto group to share with me?

Please find my results attached.

Not too bad.
3 more tests can pass with my mount helper fix for remount ;)

I have opened a libfuse issue for generic/477 (open_by_handle_at tests),
but I'm not sure if this is passthrough_hp only (it trusts the passed node id
without checking if there is an inode object for it).
Possibly fuse.ko passes an invalid node id - this is something for a rainy
weekend (or so) to investigate...

Stale file handles after mount cycle are expected.
FUSE is not equipped to handle this correctly.

I know and I don't have a problem with that. The issue is that the test
triggers a heap buffer overflow; see the ASAN report here:

https://github.com/libfuse/libfuse/issues/838

A possible reason might be an invalid node id by open_by_handle_at, or
lookup/release is not right. As I said, will investigate once I have a free
minute.


NFS clients may even get access to the wrong inode
after FUSE restart/reexport, if FUSE is exported with the same
NFS fsid.

See this discussion [3] about how this could be solved hackishly
with existing FUSE protocol (for fs that know how to open by ino)
and about the LOOKUP_HANDLE protocol command that is
needed to solve this in a generic way.

I will read through it later. I would prefer adding support up to
MAX_HANDLE_SZ - our file systems typically exceed 64-bit inode sizes.
Without having read it, I would just expose exportfs methods to userspace
(which might be the LOOKUP_HANDLE protocol).





Can you share the patch that you are using to avoid the EBUSY errors?


The simple version, which avoids _most_ of the EBUSY failures, is this:

diff --git a/common/rc b/common/rc
index 741579af..a40fca3b 100644
--- a/common/rc
+++ b/common/rc
@@ -305,6 +305,7 @@ _scratch_mount_idmapped()

   _scratch_unmount()
   {
+       sync
          case "$FSTYP" in
          overlay)
                  _overlay_scratch_unmount



The better version is this
https://github.com/kdave/xfstests/commit/33a15af07bb044e2773a83df1c7e0a0df280a4b7
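
For illustration, the umount retries mentioned in the cover letter could be
sketched roughly as below. This is a minimal sketch and not the content of
the commit linked above; the helper name _retry_unmount and its arguments
are made up for this example. It simply combines the sync from the simple
version with a bounded retry loop.

```shell
# Hypothetical helper, not part of xfstests: retry an unmount command a
# bounded number of times, flushing dirty data before each attempt.
# usage: _retry_unmount <max_tries> <delay_seconds> <command> [args...]
_retry_unmount()
{
	local max_tries=$1
	local delay=$2
	shift 2

	local try=1
	while [ "$try" -le "$max_tries" ]; do
		# Same idea as the one-line diff: flush before unmounting.
		sync
		if "$@"; then
			return 0
		fi
		# Give the filesystem daemon time to release the superblock.
		sleep "$delay"
		try=$((try + 1))
	done

	echo "unmount still failing after $max_tries tries: $*" >&2
	return 1
}
```

It could be called as `_retry_unmount 5 1 umount "$SCRATCH_MNT"`; the try
count and delay are arbitrary values chosen for the example.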


Note that Christian has suggested a method to use the inotify
IN_UNMOUNT event to wait for sb shutdown in fstests [2].

Thanks, I had seen the discussion. Although I (silently) wondered if something
like an MNT_BLOCK flag for umount2 wouldn't be easier.


You'd better keep wondering silently unless you want to upset Christian ;)

Ouch, Christian is in CC, inotify is fine ;)


Thanks,
Bernd