On 25/02/25 05:59AM, Amir Goldstein wrote:
> Sorry for html reply... Writing from mobile
> But wanted you to have the feedback pre RFC patch
>
> On Mon, Feb 24, 2025, 4:25 PM John Groves <John@xxxxxxxxxx> wrote:
>
> > Miklos et al.:
> >
> > Here are some specific questions related to the famfs port into fuse [1][2]
> > that I hope Miklos (and others) can give me feedback on soonish.
> >
> > This work is active and serious, although you haven't heard much from me
> > recently. I'm showing a famfs poster at Usenix FAST '25 this week [3].
> >
> > I'm generally following the approach in [1] - in a famfs file system,
> > LOOKUP is followed by GET_FMAP to retrieve the famfs file/dax metadata.
> > It's tempting to merge the fmap into the LOOKUP reply, but this seems like
> > an optimization to consider once basic function is established.
> >
> > Q: Do you think it makes sense to make the famfs fmap an optional,
> > variable sized addition to the LOOKUP response?
> >
>
> Please see two other email threads from the last few days about extending
> LOOKUP response for PASSTHROUGH to backing inode and related question on
> io_uring and PASSTHROUGH
>
> I think the answer to your question, how to best extend LOOKUP response,
> should be bundled with the answer to the questions on those other threads.

Thanks for the quick reply, Amir. I'm traveling this week with a hectic
schedule (Usenix FAST++), so I won't get much actual work done.

I'll stipulate that my list-participation skills need improvement, but
I haven't figured out which threads those are. If somebody can point me
directly to them, you will have my gratitude.

Time frame for my initial RFC is probably still a few weeks out.

> >
> > Whenever an fmap references a dax device that isn't already known to the
> > famfs/fuse kernel code, a GET_DAXDEV message is sent, with the reply
> > providing the info required to open the daxdev. A file becomes available
> > when the fmap is complete and all referenced daxdevs are "opened".
> >
> > Q: Any heartburn here?
> >
>
> See also similar discussions on those other email threads about alternative
> and more efficient APIs for mapping backing files.

Same, see above about my questionable list-searching skills.

>
> As Miklos said before mapping file ranges to backing file also makes sense
> for passthrough use cases so if there are similarities with GET_FMAP maybe
> there is room for sharing design goals, but I am not sure if this is the
> case?

I hope so...

> >
> > When GET_FMAP is separate from LOOKUP, READDIRPLUS won't add value unless
> > it receives fmaps as part of the attributes (i.e. lookups) that come back
> > in its response - since a READDIRPLUS that gets 12 files will still need 12
> > GET_FMAP messages/responses to be complete. Merging fmaps as optional,
> > variable-length components of the READDIRPLUS response buffers could
> > eventually make sense, but a cleaner solution initially would seem to be
> > to disable READDIRPLUS in famfs. But...
> >
> > * The libfuse/kernel ABI appears to allow low-level fuse servers that don't
> >   support READDIRPLUS...
> > * But libfuse doesn't look at conn->want for the READDIRPLUS related
> >   capabilities
> > * I have overridden that, but the kernel still sends the READDIRPLUS
> >   messages. It's possible I'm doing something hinky, and I'll keep looking
> >   for it.
> > * When I just return -ENOSYS to READDIRPLUS, things don't work well. Still
> >   looking into this.
> >
> > Q: Do you know whether the current fuse kernel mod can handle a low-level
> > fuse server that doesn't support READDIRPLUS? This may be broken.
> >
>
> Did you try returning zero d_ino from readdirplus? I think that's the
> server's way of saying I do not know how to reply as readdirplus.

No, but I'll try that when I get back home. I found documentation
somewhere that said ENOSYS was the answer.

>
> I would have liked it if there was an FOPEN_NOREADDIRPLUS per opendir.
> This could be more efficient than having to get the first readdirplus
> request.

I don't know enough to have an opinion here (yet), but I'll start
thinking it through.

> >
> > Q: If READDIRPLUS isn't actually optional, do you think the same attribute
> > reply merge (attr + famfs fmap) is viable for READDIRPLUS? I'd very much
> > like to do this part "later".
> >
> > Q: Are fuse lowlevel file systems like famfs expected to use libfuse and
> > its message handling (particularly init), or is it preferred that they not
> > do so? Seems a shame to throw away all that api version checking, but
> > turning off READDIRPLUS would require changes that might affect other
> > libfuse clients. Please advise...
> >
> > Note that the intended use cases for famfs generally involve large files
> > rather than *many* files, so giving up READDIRPLUS may not matter very
> > much, at least in the early going.
> >
>
> If famfs does not need readdirplus you should not have to deal with it.

Technically it's not that readdirplus can't be helpful for famfs - it's
just that readdirplus is not much help if the response doesn't include
famfs fmaps. But I do think the nicest bringup path would be to disable
readdirplus, so what I should see is a bunch of READDIRs followed by
(or alternating with?) a bunch of LOOKUPs.

That route should aid bringup by avoiding new variable payloads on
existing commands in the immediate term. But if said payloads are being
designed or considered, I should definitely get the famfs requirements
into that discussion, so we can do something that works for all of us!

>
> I don't see a problem with adding noreaddirplus config to libfuse, but not
> sure you even need this to be in init. I would prefer it per opendir.
>
> Thanks,
> Amir.
>
> >
> > I'm hoping to get an initial RFC patch set out in a few weeks, but these
> > questions address [some of] the open issues that need to be [at least
> > initially] resolved first.
> >
> >
> > Regards,
> > John
> >
> > [1] https://lore.kernel.org/linux-fsdevel/20241029011308.24890-1-john@xxxxxxxxxx/
> > [2] https://lwn.net/Articles/983105/
> > [3] https://www.usenix.org/conference/fast25/poster-session
> >

Thanks,
John
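
The READDIRPLUS arithmetic quoted above (12 files still needing 12 GET_FMAP
round trips) can be sketched as a toy model. This is an illustration only,
not famfs, libfuse, or kernel code: the names Scheme and requests_needed are
hypothetical, and the model assumes the whole directory fits in a single
READDIR or READDIRPLUS reply and that each not-yet-known dax device costs
exactly one GET_DAXDEV.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scheme:
    """Which per-file data the directory-listing reply already carries.

    Hypothetical model type, not part of any fuse API.
    """
    name: str
    attrs_in_listing: bool  # True for READDIRPLUS, False for plain READDIR
    fmaps_in_listing: bool  # True only if fmaps were merged into the reply


def requests_needed(scheme: Scheme, n_files: int, unknown_daxdevs: int = 0) -> int:
    """Count requests sent before all n_files are usable.

    Assumption: one READDIR/READDIRPLUS reply covers the whole directory.
    """
    total = 1                     # the READDIR or READDIRPLUS itself
    if not scheme.attrs_in_listing:
        total += n_files          # one LOOKUP per file
    if not scheme.fmaps_in_listing:
        total += n_files          # one GET_FMAP per file
    total += unknown_daxdevs      # one GET_DAXDEV per dax device not yet known
    return total


READDIR = Scheme("READDIR + LOOKUP + GET_FMAP", False, False)
READDIRPLUS = Scheme("READDIRPLUS + GET_FMAP", True, False)
READDIRPLUS_FMAP = Scheme("READDIRPLUS with merged fmaps", True, True)

if __name__ == "__main__":
    for scheme in (READDIR, READDIRPLUS, READDIRPLUS_FMAP):
        print(f"{scheme.name}: {requests_needed(scheme, 12)} requests")
```

Under these assumptions, READDIRPLUS without fmaps saves only the LOOKUPs:
the per-file GET_FMAP traffic remains until fmaps ride along in the listing
reply, which is why disabling READDIRPLUS costs little for directories with
a handful of large files.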