Re: [fuse-devel] Proxmox + NFS w/ exported FUSE = EIO

On Wed, Feb 7, 2024 at 2:08 AM Antonio SJ Musumeci <trapexit@xxxxxxxxxx> wrote:
>
> On 2/6/24 00:53, Amir Goldstein wrote:
> > On Tue, Feb 6, 2024 at 4:52 AM Antonio SJ Musumeci <trapexit@xxxxxxxxxx> wrote:
> >> Hi,
> >>
> >> Anyone have users exporting a FUSE filesystem over NFS? Particularly
> >> from Proxmox (recent release, kernel 6.5.11)? I've gotten a number of
> >> reports recently from individuals who have such a setup and after some
> >> time (not easily reproducible, seems usually after software like Plex or
> >> Jellyfin do a scan of media or a backup process) starts returning EIO
> >> errors. Not just from NFS but also when trying to access the FUSE mount
> >> as well. One person noted that they had moved from Ubuntu 18.04 (kernel
> >> 4.15.0) to Proxmox and on Ubuntu had no problems with otherwise the same
> >> settings.
> >>
> >> I've not yet been able to reproduce this issue myself but wanted to see
> >> if anyone else has run into this. As far as I can tell from what users
> >> have reported the FUSE server is still running but isn't receiving most
> >> requests. I do see evidence of statfs calls coming through but nothing
> >> else. Though the straces I've received typically are after the issues start.
> >>
> >> In an effort to rule out the FUSE server... is there anything the server
> >> could do to cause the kernel to return EIO and not forward anything but
> >> statfs? Doesn't seem to matter if direct_io is enabled or attr/entry
> >> caching is used.
> >>
> > This could be the outcome of commit 15db16837a35 ("fuse: fix illegal
> > access to inode with reused nodeid") in kernel v5.14.
> >
> > It is not an unintended regression - this behavior replaces what would
> > have been a potentially severe security violation with an EIO error.
> >
> > As the commit says:
> > "...With current code, this situation will not be detected and an old fuse
> >      dentry that used to point to an older generation real inode, can be used to
> >      access a completely new inode, which should be accessed only via the new
> >      dentry."
> >
> > I have made this fix after seeing users get the content of another
> > file from the one that they opened in NFS!
> >
> > libfuse commit 10ecd4f ("test/test_syscalls.c: check unlinked testfiles
> > at the end of the test") reproduces this problem in a test.
> > This test does not involve NFS export, but NFS export has higher
> > likelihood of exposing this issue.
> >
> > I wonder if the FUSE filesystems that report the errors have
> > FUSE_EXPORT_SUPPORT capability?
> > Not that this capability guarantees anything wrt this issue.
> >
> > IMO, the root of all evil wrt NFS+FUSE is that LOOKUP is by ino
> > without generation with FUSE_EXPORT_SUPPORT, but worse
> > is that FUSE does not even require FUSE_EXPORT_SUPPORT
> > capability to export to NFS, but this is legacy FUSE behavior and
> > I am sure that many people export FUSE filesystems, as your
> > report proves.
> >
> > There is now a proposal for opt-out of NFS export:
> > https://lore.kernel.org/linux-fsdevel/20240126072120.71867-1-jefflexu@xxxxxxxxxxxxxxxxx/
> > so there will be a way for a FUSE filesystem to prevent misuse.
> >
> > Some practical suggestions for users running existing FUSE filesystems:
> >
> > - Never export a FUSE filesystem with a fixed fsid
> > - Every time one wants to export a FUSE filesystem, generate
> >    a one-shot fsid/uuid to use in exportfs
> > - Then restarting/re-exporting the FUSE filesystem will result in
> >    ESTALE errors on NFS client, but not security violations and not EIO
> >    errors
> > - This does not give full guarantee, unlinked inodes could still result
> >    in EIO errors, as the libfuse test demonstrates
> > - The situation with NFSv4 is slightly better than with NFSv3, because
> >     with NFSv3, an open file in the client does not keep the FUSE file
> >     open and increases the chance of evicted FUSE inode for an open
> >     NFS file
> >
> > Thanks,
> > Amir.
>
> Thank you Amir for such a detailed response. I'll look into this further
> but a few questions. To answer your question: yes, the server is setting
> EXPORT_SUPPORT.
>
> 1. The expected behavior, if the above situation occurred, is that the
> whole of the mount would return EIO? All requests going forward? What
> about FUSE_STATFS? From what I saw that was coming through.
>

It's only for a specific bad/stale inode: one that you hold an open fd
for and are trying to access after another FUSE inode object has already
reused its inode number.

> 2. Regarding the tests. I downloaded the latest libfuse, compiled, and
> ran test_syscalls against the FUSE server. I get no failures when
> running `./test_syscalls /mnt/fusemount :/mnt/ext4mount -u` or
> `./test_syscalls /mnt/fusemount -u` where ext4mount is the underlying
> filesystem and fusemount is the FUSE server's. No error is reported. A
> strace shows the fstat returning ESTALE at the end but the tests all
> pass. The mount continues to work after running the test. This is on
> kernel 6.5.0. Is that expected? It sounds from your description that I
> should be seeing EIOs somewhere.
>

It is expected.
The test says:

                        // With O_PATH fd, the server does not have to keep
                        // the inode alive so FUSE inode may be stale or bad
                        if (errno == ESTALE || errno == EIO ||
                            errno == ENOENT || errno == EBADF)
                                return 0;

So it is a matter of chance which error you get.
But those EIO errors are relatively rare, so if your users see them
across the fs, it's probably due to something else.
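
As a rough illustration of that tolerance, here is a minimal sketch of how
a caller could classify the result of fstat() on a possibly stale fd. The
names is_stale_or_bad_inode(), probe_fd() and enum fd_state are made up
for this example and are not part of libfuse or the test:

```c
#include <errno.h>
#include <stdbool.h>
#include <sys/stat.h>

/* Same set of errno values the test accepts for a stale or bad FUSE inode. */
static bool is_stale_or_bad_inode(int err)
{
    return err == ESTALE || err == EIO || err == ENOENT || err == EBADF;
}

/* Hypothetical classification of an fd, for illustration only. */
enum fd_state { FD_OK, FD_STALE, FD_ERROR };

/* Probe an fd with fstat() and report whether it failed in one of the
 * "stale or bad inode" ways or in some other way. */
static enum fd_state probe_fd(int fd)
{
    struct stat st;

    if (fstat(fd, &st) == 0)
        return FD_OK;
    return is_stale_or_bad_inode(errno) ? FD_STALE : FD_ERROR;
}
```

A caller that gets FD_STALE for one fd can reopen that file; it says
nothing about the health of the rest of the mount.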

> 3. Thank you for the "practical suggestions". I will compare them to
> what my users are doing... but are there specific guidelines somewhere
> for building a FUSE server to ensure NFS export can be supported? This

I have implemented a library/fs-template for writing FUSE passthrough fs
that supports persistent NFS file handles (i.e. they survive server restart):

https://github.com/amir73il/libfuse/blob/fuse_passthrough/passthrough/fuse_passthrough.cpp

This implementation assumes passthrough to ext4/xfs.
A generic implementation would require a FUSE protocol change.

See: https://lore.kernel.org/linux-fsdevel/CAOQ4uxiJ3qxb_XNWdmQPZ3omT3fjEhoMfG=3CSKucvoJbj6JSg@xxxxxxxxxxxxxx/

> topic has had limited details available over the years and I/users have
> had odd behaviors at times whose cause was unclear. Like this
> situation or when NFS somehow triggered a request for '..' of the root
> nodeid (1). Some questions that come to mind: is the generation strictly
> necessary (practically) for things to work so long as nodeid is unique
> during a session (64bit nodeid space can last a long time)? Is there

The nodeid space is restarted on server restart and new nodeids are
assigned to the same objects.

Using server inode numbers is more sane but as the test demonstrates
it is not always enough.

> possibility of conflict if multiple fuse servers used the same
> nodeid/gen pairs at the same time?

You cannot export two different fs with the same fsid/uuid at the same
time. NFS won't let you do that.
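
For the one-shot fsid suggestion quoted earlier, a minimal sketch of
generating a fresh UUID per export could look like the following. It
assumes a Linux host (it reads /proc/sys/kernel/random/uuid); the helper
names, the "rw" option, and the client/path arguments are placeholders
for illustration, and the program only formats the exportfs command, it
does not run it:

```c
#include <stdio.h>
#include <string.h>

/* Read a fresh random UUID (36 characters plus NUL) from the Linux kernel.
 * Returns 0 on success, -1 on failure. */
static int read_random_uuid(char *buf, size_t len)
{
    FILE *f;

    if (len < 37)
        return -1;
    f = fopen("/proc/sys/kernel/random/uuid", "r");
    if (!f)
        return -1;
    if (!fgets(buf, (int)len, f)) {
        fclose(f);
        return -1;
    }
    fclose(f);
    buf[strcspn(buf, "\n")] = '\0';   /* drop a trailing newline, if any */
    return strlen(buf) == 36 ? 0 : -1;
}

/* Format an exportfs command line that uses the given UUID as fsid.
 * Returns 0 on success, -1 if the output buffer is too small. */
static int format_export_cmd(char *out, size_t len, const char *client,
                             const char *path, const char *uuid)
{
    int n = snprintf(out, len, "exportfs -o rw,fsid=%s %s:%s",
                     uuid, client, path);

    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

Because the fsid changes on every export, old NFS file handles stop
matching after a re-export and clients get ESTALE, as described above.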

>  To what degree does the inode value
> matter? Should old node/gen pairs be kept around forever as noforget
> libfuse option suggests for NFS?

It does not matter.
As long as the FUSE protocol does lookup by ino without generation
there is little that the server can do. It can only return the most
recent generation for that ino.
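
To make that concrete, here is a toy model (not the kernel or libfuse
code; every name in it is hypothetical) of a handle that carries
ino + generation being resolved through a lookup that takes only the ino:

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical NFS-style file handle: inode number plus generation. */
struct toy_handle {
    uint64_t ino;
    uint64_t gen;
};

/* Hypothetical server state: current generation per inode number. */
#define TOY_MAX_INO 16
static uint64_t toy_current_gen[TOY_MAX_INO];

/* Create an object on the given ino; each reuse bumps the generation. */
static struct toy_handle toy_create(uint64_t ino)
{
    assert(ino < TOY_MAX_INO);
    toy_current_gen[ino]++;
    return (struct toy_handle){ .ino = ino, .gen = toy_current_gen[ino] };
}

/* Lookup by ino only: the request cannot name a generation, so the only
 * possible answer is the most recent one. */
static uint64_t toy_lookup_by_ino(uint64_t ino)
{
    assert(ino < TOY_MAX_INO);
    return toy_current_gen[ino];
}

/* Resolve a handle: 0 if it still names the live object, -ESTALE if the
 * ino has since been reused by a newer generation. */
static int toy_resolve(struct toy_handle h)
{
    return toy_lookup_by_ino(h.ino) == h.gen ? 0 : -ESTALE;
}
```

In this model, keeping old ino/gen pairs around on the server side does
not help: the lookup has no way to ask for them.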

> Perhaps some of this is obvious but
> given changes to FUSE over time and the differences between kernel and
> userspace fs experiences it would be nice to have some of these more
> niche/complicated situations better fleshed out in the official docs.
>

Would be nice if someone picked up that glove, but nothing about this
is trivial...

My plan was to contribute the fuse_passthrough lib to the libfuse project,
but my focus has shifted and it still requires some work to package this lib.

If someone is interested in taking up this work and helping maintain this
library, I am willing to help them.

Thanks,
Amir.




