Re: [MAINTAINERS/KERNEL SUMMIT] Trust and maintenance of file systems

On Mon, 18 Sept 2023 at 04:14, Jan Kara <jack@xxxxxxx> wrote:
>
> I agree. On the other hand each filesystem we carry imposes some
> maintenance burden (due to tree wide changes that are happening) and the
> question I have for some of them is: Do these filesystems actually bring
> any value?

I wouldn't be shocked if we could remove half of the filesystems I
listed, and nobody would even notice.

But at the same time, the actual upside to removing them is pretty
much zero. I do agree with you that reiserfs had issues - other than
the authorship - that made people much more inclined to remove it.

I'm looking at something like sysv, for example - the ancient
14-byte filename thing. Does it have a single user? I really couldn't
tell. But at the same time, looking at the actual changes to it, they
fall into three categories:

 - trivial tree-wide changes - things like spelling fixes, or the SPDX
updates, or some "use common helpers"

 - VFS API updates, which are very straightforward (because sysvfs is
in no way doing anything odd)

 - some actual updates by Al Viro, who I doubt uses it, but I think
actually likes it and has some odd connection to it

Anyway, I went back five years, and didn't see a single thing that
looked like "that was wasted time and effort". There's a total of 44
patches over five years, so I'm looking at that filesystem and getting
a very strong feeling of "I think the minimal effort to maintain it
has been worth it".

Even without a single user, there's a history there, and it would be
sad to leave it behind. Exactly because it's _so_ little effort to
just keep.

Now, some of the other filesystems have gotten much more work done to
them - but it's because people have actively worked on them. rmk
actually did several adfs patch-series of cleanups etc back in 2019,
for example. Other than that, adfs seems to actually have gotten less
attention than sysvfs did, but I think that is probably because it
lacked the "Al Viro likes it" factor.

And something like befs - which has no knight in shining armor that
cares at all - has just a very small handful of one-liner patches for
VFS API changes.

So even the completely unloved ones just aren't a *burden*.

Reiserfs does stand out, as you say. There's a fair amount of actual
bug fixes and stuff there, because it's much more complicated, and
there were presumably a lot more complicated uses of it too due to the
history of it being an actual default distro filesystem for a while.

And that's kind of the other side of the picture: usage matters.
Something like affs or minixfs might still have a couple of users, but
those users would basically be people who likely use Linux to interact
with some legacy machine they maintain. So the usage they see would
mainly be very simple operations.

And that matters for two reasons:

 (a) we probably don't have to worry about bugs - security or
otherwise - as much. These are not generally "general-purpose"
filesystems. They are used for data transfer etc.

 (b) if they ever turn painful, we might be able to limit the pain further.

For example, mmap() is a very important operation in the general case,
and it actually causes a lot of potential problems from a filesystem
standpoint. It's one of the main sources of what little complexity
there is in the buffer head handling, for example.

But mmap() is *not* important for a filesystem that is used just for
data transport. I bet that FAT is still widely used, for example, and
while exFAT is probably making inroads, I suspect most of us have used
a USB stick with a FAT filesystem on it in the not too distant past.
Yet I doubt we'd have ever even noticed if 'mmap' didn't work on FAT.
Because all you really want for data transport is basic read/write
support.

And the reason I mention mmap is that it actually has some complexity
associated with it. If you support mmap, you have to have a read_folio
function, which is why we have mpage_readpage(), which in turn ends up
being a noticeable part of the buffer cache code. What little
complexity there is in the buffer cache does not tend to be about the
individual bh's themselves, but about the 'b_this_page' traversal, and
about how buffers can be reached not just with sb_bread() and friends,
but from the VM through the page they are in.
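To make that concrete, here's roughly what that wiring looks like for
a hypothetical buffer-head filesystem - the myfs_* names are made up,
but block_read_full_folio() and friends are the real generic helpers,
and that's where the per-page buffer chains come from:

    #include <linux/fs.h>
    #include <linux/buffer_head.h>

    static int myfs_get_block(struct inode *inode, sector_t block,
                              struct buffer_head *bh_result, int create)
    {
            /* hypothetical trivial 1:1 file-block to disk-block mapping */
            map_bh(bh_result, inode->i_sb, block);
            return 0;
    }

    static int myfs_read_folio(struct file *file, struct folio *folio)
    {
            /*
             * block_read_full_folio() attaches buffer heads to the folio,
             * chained through b_this_page, and submits reads for the
             * blocks that myfs_get_block() maps.
             */
            return block_read_full_folio(folio, myfs_get_block);
    }

    static const struct address_space_operations myfs_aops = {
            .read_folio     = myfs_read_folio,
    };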

IOW, *if* the buffer cache ever ends up being a big pain point, I
suspect that we'd still not want to remove it, but it might be that we
could go "Hmm. Let's remove all the mmap support for the filesystems
that still use the buffer cache for data pages, because that causes
problems".

I think, for example, that ext4 - which obviously needs to continue to
support mmap, and which does use buffer heads in other parts - does
*not* use the buffer cache for actual data pages, only for metadata. I
might be wrong.
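
That metadata side is the classic sb_bread() pattern - roughly
something like this (a generic sketch, not ext4's actual code; 'sb'
and 'blocknr' just stand for whatever metadata block you're after),
where the buffer comes from the block device mapping rather than from
the file's own data pages:

    struct buffer_head *bh;

    bh = sb_bread(sb, blocknr);     /* read one metadata block */
    if (!bh)
            return -EIO;
    /* interpret bh->b_data as on-disk metadata here */
    brelse(bh);                     /* drop the reference when done */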

Anyway, based on the *current* situation, I don't actually see the
buffer cache being even _remotely_ painful enough that we'd do even
that. It's not a small undertaking to get rid of the whole
b_this_page stuff and the complexity that comes from the page being
reachable through the VM layer (ie writepages etc). So it would be a
*lot* more work to rip that code out than it is to just support it.

         Linus


