Re: [RFC PATCH 10/19] rust: fs: introduce `FileSystem::read_folio`

Wedson Almeida Filho <wedsonaf@xxxxxxxxx> · Tue, 7 Nov 2023 21:35:44 -0300

On Tue, 7 Nov 2023 at 19:22, Al Viro <viro@xxxxxxxxxxxxxxxxxx> wrote:
>
> On Tue, Nov 07, 2023 at 10:18:05PM +0000, Matthew Wilcox wrote:
> > On Wed, Oct 18, 2023 at 09:25:09AM -0300, Wedson Almeida Filho wrote:
> > > @@ -36,6 +39,9 @@ pub trait FileSystem {
> > >
> > >      /// Returns the inode corresponding to the directory entry with the given name.
> > >      fn lookup(parent: &INode<Self>, name: &[u8]) -> Result<ARef<INode<Self>>>;
> > > +
> > > +    /// Reads the contents of the inode into the given folio.
> > > +    fn read_folio(inode: &INode<Self>, folio: LockedFolio<'_>) -> Result;
> > >  }
> > >
> >
> > This really shouldn't be a per-filesystem operation.  We have operations
> > split up into mapping_ops, inode_ops and file_ops for a reason.  In this
> > case, read_folio() can have a very different implementation for, eg,
> > symlinks, directories and files.  So we want to have different aops
> > for each of symlinks, directories and files.  We should maintain that
> > separation for filesystems written in Rust too.  Unless there's a good
> > reason to change it, and then we should change it in C too.

read_folio() is only called for regular files and symlinks. All other
modes (directories, pipes, sockets, char devices, block devices) have
their own read callbacks that don't involve read_folio().

For the filesystems that we have in Rust today, reading the contents
of a symlink is the same as reading a file (i.e., the name of the link
target is stored the same way as data in a file). For cases when this
is different, read_folio() can of course just check the mode of the
inode and take the appropriate path.

This is also what a bunch of C file systems do. But you folks are the
ones with most experience in file systems, if you think this isn't a
good idea, we could use read_folio() only for regular files and
introduce a function for reading symblinks, say read_symlink().

> While we are at it, lookup is also very much not a per-filesystem operation.
> Take a look at e.g. procfs, for an obvious example...

The C api offers the greatest freedom: one could write a file system
where each file has its own set of mapping_ops, inode_ops and
file_ops; and while we could choose to replicate this freedom in Rust
but we haven't.

Mostly because we don't need it, and we've been repeatedly told (by
Greg KH and others) not to introduce abstractions/bindings for
anything for which there isn't a user. Besides being a longstanding
rule in the kernel, they also say that they can't reasonably decide if
the interfaces are good if they can't see the users.

The existing Rust users (tarfs and puzzlefs) only need a single
lookup. And a quick grep (git grep \\\.lookup\\\> -- fs/) appears to
show that the vast majority of C filesystems only have a single lookup
as well. So we choose simplicity, knowing well that we may have to
revisit it in the future if the needs change.

> Wait a minute... what in name of everything unholy is that thing doing tied
> to inodes in the first place?

For the same reason as above, we don't need it in our current
filesystems. A bunch of C ones (e.g., xfs, ext2, romfs, erofs) only
use the dentry to get the name and later call d_splice_alias(), so we
hide the name extraction and call to d_splice_alias() in the
"trampoline" function.

BTW, thank you Matthew and Al, I very much appreciate that you take
the time to look into and raise concerns.

Cheers,
-Wedson