(sorry, hate it when i revert to old habits and don't reply all)

> > On Thu, Oct 24, 2019 at 12:22 AM Richard Weinberger
> > <richard.weinberger@xxxxxxxxx> wrote:
> >>
> >> On Wed, Oct 23, 2019 at 11:56 PM Chris Murphy <lists@xxxxxxxxxxxxxxxxx> wrote:
> >> > Any atomicity that depends on journal commits cannot be considered to
> >> > have atomicity in a boot context, because bootloaders don't do journal
> >> > replay. It's completely ignored.
> >>
> >> It depends on the bootloader. If you care about atomicity you need to handle
> >> the journal.
> >> There are also filesystems which *require* the journal to be handled.
> >> In that case you can still replay to memory.
> >
> > I'm vaguely curious about examples of bootloaders that do journal
> > replay, only because I can't think of any that apply. Certainly none
> > that do replay on either ext4 or XFS. I've got some stale brain cells
> > telling me there was at one time JBD code in GRUB for, I think ext3
> > journal replay (?) and all of that got ripped out a very long time
> > ago. Maybe even before GRUB 2.
>
> U-boot, for example. Of course it does not do so for any filesystem, but where
> it is needed and makes sense.

Really? uboot does journal replay on ext3/4? I think at this point the
most common file system on Linux distros is unquestionably ext4, and
the most common bootloader is GRUB and for sure GRUB is not doing
journal replay on anything, including ext4.

> Another approach is using Linux as bootloader and kexec another kernel.
> That way you can have a full filesystem implementation and bring the filesystem
> in a consistent state before reading from it.

Sure, the one or more file systems must be assumed to be dirty already.
The EFI system partition on UEFI; and the FAT32 $BOOT on ARM; as well
as the more conventional /boot which is ext4. Those must be assumed to
be dirty with journal replay required.
Yes they should have been cleanly unmounted and thus journal replay not
required, but what if that's not the case? We can't really claim atomic
updates just for the ideal cases; the claim has to hold in the worst
case scenario.

>
> >
> >> And yes, filesystem implementations in many bootloaders are in beyond
> >> shameful state.
> >
> > Right. And while that's polite language, in their defence it's just not
> > their area of expertise. I tend to think that bootloader support is a
> > burden primarily on file system folks. If you want this use case
> > supported, then do the work. Ideally the upstreams would pair
> > interested parties from each discipline to make this happen. But
> > anyway, as I've heard it described by file system folks, it may not be
> > practical to support it, in which case for the atomic update use case,
> > the modern journaled file systems are just flat out disqualified.
> >
> > Which again leads me to FAT. We must have a solution that works there,
> > even if it's some odd duck like thing, where the FAT ESP is
> > essentially a static configuration, not changing, that points to some
> > other block device (a different partition and different file system)
> > that has the desired behavioral characteristics.
> >
> >> > If a journal is present, is it appropriate to consider it a separate
> >> > and optional part of the file system?
> >>
> >> No. This is filesystem specific.
> >
> > I understand it's optional for ext3/4 insofar as it can optionally be
> > disabled, where on XFS it's compulsory. But mere presence of a journal
> > doesn't mean replay is required, there's a file system specific flag
> > that indicates replay is needed for the file system to be valid/caught
> > up to date. To what degree a file system indicating journal replay is
> > required, but can't be replayed, is still a valid file system isn't
> > answered by file system metadata. The assumption is, replay must
> > happen when indicated.
> > So if a bootloader flat out can't do that, it
> > essentially means the combination of GRUB2, das uboot,
> > syslinux/extlinux and ext3/4 or XFS, is *proscribed* if the use case
> > requires atomic kernel updates. Given the current state of affairs.
> >
> > So that leads me to, what about FAT? i.e. how does this get solved on
> > FAT? And does that help us solve it on journaled file systems? If not,
> > can it also be generic enough to solve it here? I'm actually not
> > convinced it can be solved in journaled file systems at all, unless
> > the bootloader can do journal replay, but I'm not a file system expert
> > :P
>
> Like I mentioned above, use Linux as bootloader.
> Have a minimal Linux kernel which can do kexec and the journaling filesystem
> of your choice.

Yeah that's got its own difficulties, including the way distro build
systems work. I'm not opposed to it, but it's a practical barrier to
adoption. I'd almost say it's easier to make Btrfs $BOOT compulsory,
make static ESP compulsory, and voila!

-- 
Chris Murphy
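
For reference, the "file system specific flag" mentioned in the quoted
text can be read straight from the superblock on ext3/4. A rough sketch
in Python, assuming the usual ext2/3/4 on-disk layout (superblock at
byte 1024, little-endian fields, feature words at 0x5C and 0x60). The
names ext_journal_state and fake_superblock are made up for
illustration; this is not code from any bootloader, and it only detects
the flag, it does not replay anything.

```python
import struct

SUPERBLOCK_OFFSET = 1024              # superblock starts 1024 bytes into the volume
EXT_MAGIC = 0xEF53                    # s_magic, le16 at offset 0x38
FEATURE_COMPAT_HAS_JOURNAL = 0x0004   # bit in s_feature_compat (offset 0x5C)
FEATURE_INCOMPAT_RECOVER = 0x0004     # bit in s_feature_incompat (offset 0x60)


def ext_journal_state(image: bytes) -> str:
    """Classify an ext2/3/4 image by its journal-related feature flags."""
    sb = image[SUPERBLOCK_OFFSET:SUPERBLOCK_OFFSET + 1024]
    if len(sb) < 0x64:
        return "not-ext"
    (magic,) = struct.unpack_from("<H", sb, 0x38)
    if magic != EXT_MAGIC:
        return "not-ext"
    compat, incompat = struct.unpack_from("<II", sb, 0x5C)
    if not compat & FEATURE_COMPAT_HAS_JOURNAL:
        return "no-journal"
    if incompat & FEATURE_INCOMPAT_RECOVER:
        return "needs-replay"         # journal has uncommitted work to apply
    return "clean"


def fake_superblock(compat: int, incompat: int) -> bytes:
    """Hypothetical helper: build a zeroed image with only the fields above set."""
    buf = bytearray(2048)
    struct.pack_into("<H", buf, SUPERBLOCK_OFFSET + 0x38, EXT_MAGIC)
    struct.pack_into("<II", buf, SUPERBLOCK_OFFSET + 0x5C, compat, incompat)
    return bytes(buf)


if __name__ == "__main__":
    print(ext_journal_state(fake_superblock(0x4, 0x4)))
```

A bootloader that can't replay could in principle use a check like this
to refuse or warn on "needs-replay" rather than read possibly stale
metadata. Whether that counts as good enough for atomic updates is
exactly the open question in this thread.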