On Mon, Oct 05, 2020 at 10:03:06PM -0700, Josh Triplett wrote:
>
> I'm not trying to create a problem here; I'm trying to address a whole
> family of problems. I was generally under the impression that mounting
> existing root filesystems fell under the scope of the kernel<->userspace
> or kernel<->existing-system boundary, as defined by what the kernel
> accepts and existing userspace has used successfully, and that upgrading
> the kernel should work with existing userspace and systems. If there's
> some other rule that applies for filesystems, I'm not aware of that.
> (I'm also not trying to suggest that every random corner case of what
> the kernel *could* accept needs to be the format definition, but rather,
> cases that correspond to existing userspace.)

I'm not opposed to the kernel side change; it's *this time*. I'm more
interested in killing off the tool that generated the malformed file
system in the first place. As I keep pointing out, things aren't
going to go well if "e2fsck -E unshare_blocks" is applied to it. So
users who use this unofficial tool to create this file system can run
into at least this corner case, if not others, and that will result
in, as the UI designers like to say, "a poor user experience".

We had a similar issue with Android. Many years ago, Andy Rubin was
originally quite allergic to the GPL, and had tried to promulgate the
rule, "no GPL in Android Userspace". This is why bionic is used as
libc, and this resulted in Android engineers (I think before the
Google acquisition, but I'm not 100% sure) creating an unofficial,
"unauthorized" make_ext4fs, which was a BSD-licensed version of mke2fs.
Unfortunately, it created file systems which the kernel would never
complain about, but which, 50% of the time, would under some
circumstances get corrupted (even more) when e2fsck attempted to
repair the file system.
So if a user had a bit flip caused by an eMMC hiccup, e2fsck could end
up making things worse. Worse, make_ext4fs had, over time, grown extra
functionality, such as pre-setting the SELinux xattrs, such that you
couldn't just replace it with mke2fs. It took *years* to fix the
problem, and that's why contrib/e2fsdroid exists today. We finally, a
few years ago, were able to retire make_ext4fs and replace it with the
combination of mke2fs and e2fsdroid.

So that's why I really don't like it when there are "unauthorized",
unofficial tools creating file systems out there which we are now
obliged to support. Even if it's OK as far as the kernel is
concerned, unless you're planning on forking and/or reimplementing all
of e2fsprogs, and doing so correctly, that way is going to cause
headaches for file system developers.

As far as I'm concerned, it's not just about the on-disk file system
format, it's also about the official user space tools. If you create
a file system which the kernel is happy with, but which wasn't created
using the official user space tools, file systems are so full of state
and permutations of how things should be done that the opportunities
for mischief are huge. And what's especially aggravating is when it's
done for petty reasons: whether it's trying to save an extra 0.0003%
of storage, or because some VP was allergic to the GPL, it's stupid
stuff.

> I don't *want* to rely on what apparently turned out to be an
> undocumented bug in the kernel's validator. That's why I was trying to
> fix the issue in what seemed like the right way, by detecting the
> situation and turning off the validator. That seemed like it would fully
> address the issue. If it would help, I could also supply a tiny filesystem
> image for regression testing.

I'm OK with working around the problem, and we're lucky that it's this
simple... this time around. But can we *please* take your custom
tool out back and shoot it in the head?
Like make_ext4fs, it's just going to cause more headaches down the
line.

And perhaps we need to make a policy that makes it clear that for file
systems, it's not just about "whatever the kernel happens to accept".
It should also be, "was it generated using an official version of the
userspace tools", at least as a consideration. Yes, we can try to
make the kernel more strict, and that's a good thing to do, but
inevitably, as we make the kernel more strict, we can potentially
break other unofficial tools out there, and it's going to make it a
lot harder to do backwards-compatible format enhancements to the file
system.

					- Ted