Re: [PATCH v13 00/10] fuse: Add support for passthrough read/write

On Tue, 6 Jun 2023 at 13:19, Amir Goldstein <amir73il@xxxxxxxxx> wrote:
>
> On Tue, Jun 6, 2023 at 12:49 PM Miklos Szeredi <miklos@xxxxxxxxxx> wrote:
> >
> > On Tue, 6 Jun 2023 at 11:13, Amir Goldstein <amir73il@xxxxxxxxx> wrote:
> > >
> > > On Fri, May 19, 2023 at 3:57 PM Amir Goldstein <amir73il@xxxxxxxxx> wrote:
> > > >
> > > > Miklos,
> > > >
> > > > This patch set addresses your review feedback on Alessio's v12 patch set
> > > > from 2021 [1] as well as other bugs that I have found since.
> > > > This patch set uses refcounted backing files as we discussed recently [2].
> > > >
> > > > I am posting this for several possible outcomes:
> > > >
> > > > 1. Either FUSE-BPF developers can use this as a reference implementation
> > > >    for their 1st phase of "backing file passthrough"
> > > > 2. Or they can tell me which API changes need to be made to this patch set
> > > >    so the API is flexible enough to extend to "backing inode passthrough"
> > > >    and to "BPF filters" later on
> > > > 3. We find there is little overlap in the APIs and merge this as is
> > > >
> > > > These patches are available on github [3] along with libfuse patches [4].
> > > > I tested them by running xfstests (./check -fuse -g quick.rw) with latest
> > > > libfuse xfstest support.
> > > >
> > > > Without FOPEN_PASSTHROUGH, one test in this group fails (generic/451)
> > > > which tests mixed buffered/aio writes.
> > > > With FOPEN_PASSTHROUGH, this test also passes.
> > > >
> > > > This revision does not set any limitations on the number of backing files
> > > > that can be mapped by the server.  I considered several ways to address
> > > > this and decided to try a different approach.
> > > >
> > > > Patch 10 (with matching libfuse patch) is an RFC patch for an alternative
> > > > API approach. Please see my comments on that patch.
> > > >
> > >
> > > Miklos,
> > >
> > > I wanted to set expectations w.r.t this patch set and the passthrough
> > > feature development in general.
> > >
> > > So far I've seen comments from you up to patch 5/10, so I assume you
> > > did not get up to RFC patch 10/10.
> > >
> > > The comments about adding max stack depth to protocol and about
> > > refactoring overlayfs common code are easy to do.
> > >
> > > However, I feel that there are still open core design questions that need
> > > to be spelled out, before we continue.
> > >
> > > Do you find the following acceptable for first implementation, or do you
> > > think that those issues must be addressed before merging anything?
> > >
> > > 1. No lsof visibility of backing files (if server closes them)
> > > 2. Derived backing files resource limit (cannot grow beyond nr of fuse files)
> > > 3. No data consistency guarantee between different fds to the same inode
> > >     (i.e. backing is per fd, not per inode)
> >
> > I think the most important thing is to have the FUSE-BPF team onboard.
>
> Yeh, I'd love to get some feedback from you guys.
>
> >    I'm not sure that the per-file part of this is necessary, doing
> > everything per-inode should be okay.   What are the benefits?
> >
>
> I agree that semantics are simpler with per-inode.
> The only benefit I see to per-file is the lifetime of the mapping.
>
> It is very easy IMO to program with a mapping scope of
> open-to-close that is requested by FOPEN_PASSTHROUGH
> and FOPEN_PASSTHROUGH_AUTO_CLOSE.

Right, and in this case the resource limiting is also easy to think about.

I'm not worried about consistency, fuse server can do whatever it
wants with the data anyway.  I am worried about the visibility of the
mapping.  One idea would be to create a kernel thread for each fuse sb
instance and install mapped files into that thread's file table.  In
theory this should be trivial as the VFS has all the helpers that can
do this safely.

>
> But I think the same lifetime can still be achieved with per-inode
> mapping. I hand waved how I think that could be done in response
> to patch 10/10 review.

Yeah, but it's just complicating things further.


>
> I think if I can make this patch set work per-inode, the roadmap
> from here to FUSE-BPF would be much more clear.

One advantage of per-inode non-autoclose would be that open could be
done without a roundtrip to the server.  However, the resource
limiting becomes harder to think about.

So it might make sense to just create two separate modes:

 - per-open unmap-on-release (autoclose)
 - per-inode unmap-on-forget (non-autoclose, mapping can be torn down
explicitly)

We also should at least consider (even if not supported in the first
version) the block-fuse case where the backing file(s) contain the
disk image.  In this case the backing fd registration would be per-fs
not per-inode, and the mapping would contain the
backing_fd/backing_offset pair.

>
> > Not having visibility and resource limits would be okay for a first
> > version, as long as it's somehow constrained to privileged use.  But
> > I'm not sure it would be worth it that way.
> >
>
> Speaking on behalf of my own use case for FUSE passthrough (HSM),
> FUSE is used for "do something that does not belong in the kernel",
> but running as unprivileged user is a non-requirement.
> So, based on paying customers' use cases, I can say with confidence that
> passthrough is useful and essential even with the privileged-user constraint.
>
> In summary, I will try to come up with v14 that is:
> - privileged user only
> - no resource limitation
> - per-inode mapping

Okay, that's a logical first step.

Thanks,
Miklos
