Re: [fuse-devel] [PATCH RESEND V12 3/8] fuse: Definitions and ioctl for passthrough

Amir Goldstein <amir73il@xxxxxxxxx> · Tue, 16 May 2023 11:48:42 +0300

On Tue, May 16, 2023 at 12:45 AM Paul Lawrence <paullawrence@xxxxxxxxxx> wrote:
>
> On Mon, May 15, 2023 at 2:11 PM Bernd Schubert
> <bernd.schubert@xxxxxxxxxxx> wrote:
> > On 5/15/23 22:16, Nikolaus Rath wrote:
> > > On May 15 2023, Amir Goldstein <amir73il@xxxxxxxxx> wrote:
> > >> On Mon, May 15, 2023 at 10:29 AM Miklos Szeredi <miklos@xxxxxxxxxx> wrote:
> > >>> On Fri, 12 May 2023 at 21:37, Amir Goldstein <amir73il@xxxxxxxxx> wrote:
> > >>>
> > >>>> I was waiting for LSFMM to see if and how FUSE-BPF intends to
> > >>>> address the highest value use case of read/write passthrough.
> > >>>>
> > >>>> From what I've seen, you are still taking a very broad approach of
> > >>>> all-or-nothing which still has a lot of core design issues to address,
> > >>>> while these old patches already address the most important use case
> > >>>> of read/write passthrough of fd without any of the core issues
> > >>>> (credentials, hidden fds).
> > >>>>
> > >>>> As far as I can tell, this old implementation is mostly independent of your
> > >>>> lookup based approach - they share the low level read/write passthrough
> > >>>> functions but not much more than that, so merging them should not be
> > >>>> a blocker to your efforts in the longer run.
> > >>>> Please correct me if I am wrong.
> > >>>>
> > >>>> As things stand, I intend to re-post these old patches with mandatory
> > >>>> FOPEN_PASSTHROUGH_AUTOCLOSE to eliminate the open
> > >>>> questions about managing mappings.
> > >>>>
> > >>>> Miklos, please stop me if I missed something and if you do not
> > >>>> think that these two approaches are independent.
> > >>>
> > >>> Do you mean that the BPF patches should use their own passthrough mechanism?
> > >>>
> > >>> I think it would be better if we could agree on a common interface for
> > >>> passthrough (or per Paul's suggestion: backing) mechanism.
> > >>
> > >> Well, not exactly different.
> > >> With BPF patches, if you have a backing inode that was established during
> > >> LOOKUP with rules to do passthrough for open(), you'd get a backing file and
> > >> that backing file would be used to passthrough read/write.
> > >>
> > >> FOPEN_PASSTHROUGH is another way to configure passthrough read/write
> > >> to a backing file that is controlled by the server per open fd instead of by BPF
> > >> for every open of the backing inode.
> > >>
> > >> Obviously, both methods would use the same backing_file field and
> > >> same read/write passthrough methods regardless of how the backing file
> > >> was setup.
> > >>
> > >> Obviously, the BPF patches will not use the same ioctl to set up passthrough
> > >> (and/or BPF program) to a backing inode, but I don't think that matters much.
> > >> When we settle on ioctls for setting up backing inodes, we can also add new
> > >> ioctls for setting up backing file with optional BPF program.
> > >
> > >> I don't see any reason to make the first ioctl more complicated than this:
> > >>
> > >> struct fuse_passthrough_out {
> > >>          uint32_t        fd;
> > >>          /* For future implementation */
> > >>          uint32_t        len;
> > >>          void            *vec;
> > >> };
> > >>
> > >> One advantage with starting with FOPEN_PASSTHROUGH, besides
> > >> dealing with the highest priority performance issue, is how it deals with
> > >> resource limits on open files.
> > >
> > > One thing that struck me when we discussed FUSE-BPF at LSF was that from
> > > a userspace point of view, FUSE-BPF presents an almost completely
> > > different API than traditional FUSE (at least in its current form).
> > >
> > > As long as there is no support for falling back to standard FUSE
> > > callbacks, using FUSE-BPF means that most of the existing API no longer
> > > works, and instead there is a large new API surface that doesn't work in
> > > standard FUSE (the pre-filter and post-filter callbacks for each
> > > operation).

I think there is some confusion here that needs to be cleared up.
I was confused when you asked in the session why the usermode
post-filter was needed.

IIUC, there is no usermode post filter. There are only in-kernel BPF
pre/post filters.

Paul/Daniel will correct me if I am wrong, but I think the FUSE server
is still called at most once per op, same as with legacy FUSE. The
difference with FUSE-BPF is that the server may be bypassed altogether.

The pre/post filters are used to toggle bypass mode, either permanently
or for a specific op, and the post filter can also be used to modify
the server response.
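
To make sure we are talking about the same thing, here is a toy
userspace model of the flow as I understand it. It is only a sketch:
every name in it is made up for illustration and none of them come from
the FUSE-BPF patches. It models the per-op decision only, not the
permanent toggle.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: all names are made up, none come from the patches. */
enum toy_verdict {
	TOY_BYPASS,		/* serve the op from the backing file */
	TOY_CALL_SERVER,	/* fall back to a legacy FUSE request */
};

struct toy_op {
	int opcode;
	int result;
	int server_calls;	/* how many times the server was asked */
};

typedef enum toy_verdict (*toy_pre_fn)(const struct toy_op *op);
typedef void (*toy_post_fn)(struct toy_op *op);

/* Stand-ins for the backing filesystem and the userspace server. */
static int toy_backing(const struct toy_op *op) { return 100 + op->opcode; }

static int toy_server(struct toy_op *op)
{
	op->server_calls++;
	return 200 + op->opcode;
}

/* Example pre filter: bypass the server for opcode 1 only. */
static enum toy_verdict toy_pre_bypass_op1(const struct toy_op *op)
{
	return op->opcode == 1 ? TOY_BYPASS : TOY_CALL_SERVER;
}

/* Example post filter: rewrite the result before it is returned. */
static void toy_post_negate(struct toy_op *op)
{
	op->result = -op->result;
}

/* The server is asked at most once per op; both filters run in-kernel. */
static int toy_dispatch(struct toy_op *op, toy_pre_fn pre, toy_post_fn post)
{
	enum toy_verdict v;

	assert(op != NULL);
	v = pre ? pre(op) : TOY_CALL_SERVER;
	op->result = (v == TOY_BYPASS) ? toy_backing(op) : toy_server(op);
	if (post)
		post(op);
	return op->result;
}
```

With no pre filter this degenerates to legacy FUSE (one server call per
op), and the post filter never calls back into usermode.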

> > >
> > > I think this means that FUSE-BPF file systems won't work with FUSE, and
> > > FUSE filesystems won't work with FUSE-BPF.
> >
> > Is that so? I think I found some incompatibilities in the patches (need to
> > double check), but doesn't it just do normal fuse operations and then
> > replies with an ioctl to do passthrough?

About that, I wanted to ask.
Alessio's initial patches took a similar approach, but without an ioctl:
the passthrough/backing fd was provided as part of the response to the
OPEN request.

Following feedback from Miklos and Jens, not only was the passthrough
request moved to an ioctl, it was also decoupled from the OPEN
response.

This allows the server more flexibility in managing the passthrough
mode of files (or inodes in FUSE-BPF case).
The FUSE-BPF patches use an ioctl for the response, but without the
decoupling. I wonder if that should be amended for the next version?
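
By "decoupled" I mean roughly the two-step flow below. This is a toy
userspace model with hypothetical names, not the uapi from the patches:
the server first registers a backing fd and gets back an id, and the
OPEN reply later carries only that id (or 0 for no passthrough).

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_BACKING 8

/* Toy model only: made-up names, not the uapi from the patches. */
struct toy_conn {
	/* 0 marks a free slot, so this toy assumes backing fds are > 0 */
	int backing_fd[TOY_MAX_BACKING];
};

/* What the OPEN reply carries in this model: just an id, 0 = none. */
struct toy_open_reply {
	int passthrough_id;
};

/* Step 1 (the "ioctl"): register a backing fd, get a non-zero id back. */
static int toy_passthrough_register(struct toy_conn *c, int fd)
{
	assert(c != NULL && fd > 0);
	for (int i = 0; i < TOY_MAX_BACKING; i++) {
		if (c->backing_fd[i] == 0) {
			c->backing_fd[i] = fd;
			return i + 1;
		}
	}
	return -1;	/* table full */
}

/* Step 2 (on OPEN reply): resolve the id to the registered backing fd. */
static int toy_resolve(const struct toy_conn *c,
		       const struct toy_open_reply *r)
{
	int fd;

	if (r->passthrough_id <= 0 || r->passthrough_id > TOY_MAX_BACKING)
		return -1;	/* no passthrough requested, or bad id */
	fd = c->backing_fd[r->passthrough_id - 1];
	return fd ? fd : -1;
}
```

Because registration is separate from the reply, the server is free to
decide when to set up a mapping and when to reference it.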

> > BPF is used for additional
> > filtering, that would have to be done otherwise in userspace.
> >
> > Really difficult in the current patch set and data structures is to see
> > what is actually BPF and what is passthrough.
>
> I hope that fuse and fuse-bpf play together a little better than that
> ;) In the current design, you can set a backing file from within
> traditional fuse lookups, which moves you to fuse-bpf for that
> file/directory, and you can remove the backing file during the
> post-filter, moving that node back to fuse. You can also return a
> value from the bpf prefilter that tells fuse to use traditional fuse
> for that command. I think this is a very useful feature - it's one of
> the first ones we used in Android.
>
> If we do find any areas where we can't easily switch between
> traditional fuse and fuse-bpf, we would consider that a bug and fix it
> as fast as possible.
>
> And yes, we got the feedback from LSFMMBPF that the current patches
> are hard to follow, and we will be reordering them and resending them
> as three patchsets. One will add backing files, one will add backing
> directories, and the final will add bpf filters to both. Hopefully
> that will make them easier to understand.
>

That sounds great!
I started to dust off Alessio's patches.
I might just post what I have as a reference implementation that
we can compare to your "backing file" series.
I would much rather that your version be the one that ends up
merged in the end ;-)

Thanks,
Amir.