On Tue, Jun 6, 2023 at 12:49 PM Miklos Szeredi <miklos@xxxxxxxxxx> wrote: > > On Tue, 6 Jun 2023 at 11:13, Amir Goldstein <amir73il@xxxxxxxxx> wrote: > > > > On Fri, May 19, 2023 at 3:57 PM Amir Goldstein <amir73il@xxxxxxxxx> wrote: > > > > > > Miklos, > > > > > > This patch set addresses your review feedback on Alesio's V12 patch set > > > from 2021 [1] as well as other bugs that I have found since. > > > This patch set uses refcounted backing files as we discussed recently [2]. > > > > > > I am posting this for several possible outcomes: > > > > > > 1. Either FUSE-BPF develpers can use this as a reference implementation > > > for their 1st phase of "backing file passthrough" > > > 2. Or they can tell me which API changes need to made to this patch set > > > so the API is flexible enough to extend to "backing inode passthrough" > > > and to "BPF filters" later on > > > 3. We find there is little overlap in the APIs and merge this as is > > > > > > These patches are available on github [3] along with libfuse patches [4]. > > > I tested them by running xfstests (./check -fuse -g quick.rw) with latest > > > libfuse xfstest support. > > > > > > Without FOPEN_PASSTHROUGH, one test in this group fails (generic/451) > > > which tests mixed buffered/aio writes. > > > With FOPEN_PASSTHROUGH, this test also passes. > > > > > > This revision does not set any limitations on the number of backing files > > > that can be mapped by the server. I considered several ways to address > > > this and decided to try a different approach. > > > > > > Patch 10 (with matching libfuse patch) is an RFC patch for an alternative > > > API approach. Please see my comments on that patch. > > > > > > > Miklos, > > > > I wanted to set expectations w.r.t this patch set and the passthrough > > feature development in general. > > > > So far I've seen comments from you up to path 5/10, so I assume you > > did not get up to RFC patch 10/10. > > > > The comments about adding max stack depth to protocol and about > > refactoring overlayfs common code are easy to do. > > > > However, I feel that there are still open core design questions that need > > to be spelled out, before we continue. > > > > Do you find the following acceptable for first implementation, or do you > > think that those issues must be addressed before merging anything? > > > > 1. No lsof visibility of backing files (if server closes them) > > 2. Derived backing files resource limit (cannot grow beyond nr of fuse files) > > 3. No data consistency guaranty between different fd to the same inode > > (i.e. backing is per fd not per inode) > > I think the most important thing is to have the FUSE-BPF team onboard. Yeh, I'd love to get some feedback from you guys. > I'm not sure that the per-file part of this is necessary, doing > everything per-inode should be okay. What are the benefits? > I agree that semantics are simpler with per-inode. The only benefit I see to per-file is the lifetime of the mapping. It is very easy IMO to program with a mapping scope of open-to-close that is requested by FOPEN_PASSTHROUGH and FOPEN_PASSTHROUGH_AUTO_CLOSE. But I think the same lifetime can still be achieved with per-inode mapping. I hand waved how I think that could be done in response to patch 10/10 review. I think if I can make this patch set work per-inode, the roadmap from here to FUSE-BPF would be much more clear. > Not having visibility and resource limits would be okay for a first > version, as long as it's somehow constrained to privileged use. But > I'm not sure it would be worth it that way. > Speaking on behalf of my own use case for FUSE passthrough (HSM), FUSE is used for "do something that does not belong in the kernel", but running as unprivileged user is a non-requirement. So I can say with confidence of paying customers that passthrough is useful and essential even with privileged user constraint. In summary, I will try to come up with v14 that is: - privileged user only - no resource limitation - per-inode mapping If at any time FUSE-BFP team would like to take this patch set of my hands, or propose a replacement for it, I would be very happy to step down - whatever it takes to land read/write passthrough. Thanks, Amir.