Re: famfs port to fuse - questions

On 25/02/25 05:59AM, Amir Goldstein wrote:
> Sorry for html reply... Writing from mobile
> But wanted you to have the feedback pre RFC patch
> 
> On Mon, Feb 24, 2025, 4:25 PM John Groves <John@xxxxxxxxxx> wrote:
> 
> > Miklos et al.:
> >
> > Here are some specific questions related to the famfs port into fuse [1][2]
> > that I hope Miklos (and others) can give me feedback on soonish.
> >
> > This work is active and serious, although you haven't heard much from me
> > recently. I'm showing a famfs poster at Usenix FAST '25 this week [3].
> >
> > I'm generally following the approach in [1] - in a famfs file system,
> > LOOKUP is followed by GET_FMAP to retrieve the famfs file/dax metadata.
> > It's tempting to merge the fmap into the LOOKUP reply, but this seems like
> > an optimization to consider once basic function is established.
> >
> > Q: Do you think it makes sense to make the famfs fmap an optional,
> >    variable sized addition to the LOOKUP response?
> >
> 
> Please see two other email threads from last few days about extending
> LOOKUP response for PASSTHROUGH to backing inode and related question on
> io_uring and PASSTHROUGH
> 
> I think the answer to your question, how to best extend LOOKUP response,
> should be bundled with the answer to the questions on those other threads.

Thanks for the quick reply, Amir. I'm traveling this week with a hectic
schedule (Usenix FAST++), so I won't get much actual work done.

I'll stipulate that my list-participation skills need improvement, but
I haven't figured out which threads those are. If somebody can point me
directly to them, you will have my gratitude.

Time frame for my initial RFC is probably still a few weeks out.

> 
> 
> > Whenever an fmap references a dax device that isn't already known to the
> > famfs/fuse kernel code, a GET_DAXDEV message is sent, with the reply
> > providing the info required to open the daxdev. A file becomes available
> > when the fmap is complete and all referenced daxdevs are "opened".
> >
> > Q: Any heartburn here?
> >
> 
> See also similar discussions on those other email threads about alternative
> and more efficient APIs for mapping backing files.

Same, see above about my questionable list-searching skills.
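
To make the flow quoted above concrete (LOOKUP, then GET_FMAP, then a
GET_DAXDEV for each daxdev the kernel has not seen yet), here is a minimal
user-space sketch. It is not the famfs or fuse kernel code: the struct
layouts, the table size, and the names fmap_resolve() and get_daxdev() are
all hypothetical, and the GET_DAXDEV round trip is replaced by a stub that
always succeeds.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_DAXDEVS 8   /* hypothetical size of the daxdev table */
#define MAX_EXTENTS 4   /* hypothetical per-file extent limit */

/* Hypothetical fmap: a list of extents, each naming a daxdev by index. */
struct fmap_extent {
    int daxdev_index;
    unsigned long offset;
    unsigned long len;
};

struct fmap {
    size_t nr_extents;
    struct fmap_extent ext[MAX_EXTENTS];
};

/* Hypothetical table of daxdevs that have already been opened. */
struct daxdev_table {
    bool opened[MAX_DAXDEVS];
    int get_daxdev_sent;    /* number of simulated GET_DAXDEV requests */
};

/* Stand-in for a GET_DAXDEV request/reply; assumed to always succeed. */
static bool get_daxdev(struct daxdev_table *t, int idx)
{
    t->get_daxdev_sent++;
    t->opened[idx] = true;
    return true;
}

/*
 * Walk the fmap and "open" any daxdev not already known. Returns true
 * only when every referenced daxdev is opened, which is the point at
 * which the file would become available.
 */
static bool fmap_resolve(struct daxdev_table *t, const struct fmap *m)
{
    for (size_t i = 0; i < m->nr_extents; i++) {
        int idx = m->ext[i].daxdev_index;

        if (idx < 0 || idx >= MAX_DAXDEVS)
            return false;   /* malformed fmap: reject the file */
        if (!t->opened[idx] && !get_daxdev(t, idx))
            return false;
    }
    return true;
}
```

In this sketch a second file whose fmap references the same daxdevs
triggers no further GET_DAXDEV traffic, since the table is shared across
files.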

> 
> As Miklos said before, mapping file ranges to a backing file also makes
> sense for passthrough use cases, so if there are similarities with GET_FMAP
> maybe there is room for sharing design goals, but I am not sure if this is
> the case?

I hope so...

> 
> 
> > When GET_FMAP is separate from LOOKUP, READDIRPLUS won't add value unless
> > it receives fmaps as part of the attributes (i.e. lookups) that come back
> > in its response - since a READDIRPLUS that gets 12 files will still need
> > 12 GET_FMAP messages/responses to be complete. Merging fmaps as optional,
> > variable-length components of the READDIRPLUS response buffers could
> > eventually make sense, but a cleaner solution initially would seem to be
> > to disable READDIRPLUS in famfs. But...
> >
> > * The libfuse/kernel ABI appears to allow low-level fuse servers that don't
> >   support READDIRPLUS...
> > * But libfuse doesn't look at conn->want for the READDIRPLUS related
> >   capabilities
> > * I have overridden that, but the kernel still sends the READDIRPLUS
> >   messages. It's possible I'm doing something hinky, and I'll keep looking
> >   for it.
> > * When I just return -ENOSYS to READDIRPLUS, things don't work well. Still
> >   looking into this.
> >
> > Q: Do you know whether the current fuse kernel mod can handle a low-level
> >    fuse server that doesn't support READDIRPLUS? This may be broken.
> >
> 
> Did you try returning zero d_ino from readdirplus? I think that's the
> server's way of saying I do not know how to reply as readdirplus.

No, but I'll try that when I get back home. I found documentation somewhere
that said ENOSYS was the answer.

> 
> I would have liked it if there was an FOPEN_NOREADDIRPLUS per opendir.
> This could be more efficient than having to get the first readdirplus
> request.

I don't know enough to have an opinion here (yet), but I'll start thinking
it through.

> 
> 
> > Q: If READDIRPLUS isn't actually optional, do you think the same attribute
> >    reply merge (attr + famfs fmap) is viable for READDIRPLUS? I'd very much
> >    like to do this part "later".
> 
> 
> > Q: Are fuse lowlevel file systems like famfs expected to use libfuse and
> >    its message handling (particularly init), or is it preferred that they
> >    not do so? Seems a shame to throw away all that api version checking,
> >    but turning off READDIRPLUS would require changes that might affect
> >    other libfuse clients. Please advise...
> >
> > Note that the intended use cases for famfs generally involve large files
> > rather than *many* files, so giving up READDIRPLUS may not matter very
> > much, at least in the early going.
> >
> 
> If famfs does not need readdirplus you should not have to deal with it.

Technically it's not that readdirplus can't be helpful for famfs - it's just
that readdirplus is not much help if the response doesn't include famfs
fmaps. 

But I do think the nicest bringup path would be to disable readdirplus, 
so what I should see is a bunch of READDIRs followed by (or alternating 
with?) a bunch of LOOKUPs. That route should aid bringup by avoiding 
new variable payloads on existing commands in the immediate term.
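
As a rough illustration of the trade-off, the sketch below counts requests
needed to make every file in one directory usable under three strategies.
It is back-of-the-envelope arithmetic, not a measurement: the enum and
requests_for_dir() are hypothetical names, it assumes one directory-read
request returns all entries, nothing is cached, and GET_DAXDEV traffic is
not counted. The third strategy (fmaps merged into READDIRPLUS replies)
does not exist today.

```c
#include <stddef.h>

enum dir_strategy {
    READDIR_THEN_LOOKUP,    /* readdirplus disabled */
    READDIRPLUS_NO_FMAP,    /* readdirplus replies carry attrs only */
    READDIRPLUS_WITH_FMAP   /* hypothetical: fmaps merged into replies */
};

/*
 * Requests needed before all nfiles files in a directory have both
 * attributes and an fmap, under the assumptions stated above.
 */
static size_t requests_for_dir(enum dir_strategy s, size_t nfiles)
{
    switch (s) {
    case READDIR_THEN_LOOKUP:
        /* 1 READDIR + one LOOKUP and one GET_FMAP per file */
        return 1 + nfiles + nfiles;
    case READDIRPLUS_NO_FMAP:
        /* 1 READDIRPLUS + one GET_FMAP per file */
        return 1 + nfiles;
    case READDIRPLUS_WITH_FMAP:
        /* everything arrives in the READDIRPLUS reply */
        return 1;
    }
    return 0;
}
```

For the 12-file case mentioned earlier in the thread, this gives 25, 13,
and 1 requests respectively. With large files rather than many files, the
gap between the first two rows is small in absolute terms, which is why
disabling readdirplus looks acceptable for bringup.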

But if said payloads are being designed or considered, I should definitely
get the famfs requirements into that discussion, so we can do something that
works for all of us!

> 
> I don't see a problem with adding noreaddirplus config to libfuse, but not
> sure you even need this to be in init. I would prefer it per opendir.
> 
> Thanks,
> Amir.
> 
> 
> > I'm hoping to get an initial RFC patch set out in a few weeks, but these
> > questions address [some of] the open issues that need to be [at least
> > initially] resolved first.
> >
> >
> > Regards,
> > John
> >
> > [1]
> > https://lore.kernel.org/linux-fsdevel/20241029011308.24890-1-john@xxxxxxxxxx/
> > [2] https://lwn.net/Articles/983105/
> > [3] https://www.usenix.org/conference/fast25/poster-session
> >

Thanks,
John




