On Fri, Apr 20, 2018 at 3:32 PM, Lennart Poettering <mzerqung@xxxxxxxxxxx> wrote:
> On Fr, 20.04.18 12:20, Chris Murphy (lists@xxxxxxxxxxxxxxxxx) wrote:
>
>> I'm honestly mystified why the plymouth commit hasn't been reverted in
>> the interim. But I'm also mystified why the bootloader folks don't
>> give a shit to commit their configuration files to disk when they know
>> they can't do journal replay and have known that for 20 years. But
>> then I'm also mystified why systemd developers won't fallback to
>> freeze/thaw if rootfs remount-ro fails three times, instead of just
>> giving up and forcing a reboot, they did discuss doing this a year ago
>> and then poof, no action.
>
> Quite frankly, if you want to put the blame somewhere, I'd probably
> place it with the xfs folks? I mean, there's a well-defined API on
> linux for syncing a file system to disk so that it is in a clean
> state, it's called sync(). Turns out that doesn't work though, it
> doesn't actually do that.

OK, after a bunch of additional testing, one thing is certain: grubby
and grub-mkconfig are not even remotely crash safe. They do not fsync()
or sync() at all. That's not good no matter the file system.

The best near-term change that I think we should make is putting a
sync() somewhere in grubby. I've tested modifying new-kernel-pkg to add
a sync right before the last line, and then running:

# dnf update -y /path/to/kernelrpms && echo b > /proc/sysrq-trigger

With very limited testing (dozens of reboots in multiple fs
configurations of a VM only, using unsafe caching), all configurations
including XFS are better off with sync() added. It's a highly
non-deterministic mess to sort out without sync().

But with sync() and with /boot on XFS, there is an unknown (let's say
50/50) probability that the bootloader can't find or read the kernel,
initramfs, or grub.cfg despite the sync().
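For illustration only, here is a rough Python sketch of the general
"write, flush, then sync()" pattern being discussed. This is not
grubby's or grub-mkconfig's code; the helper name replace_file_durably
and any paths passed to it are hypothetical. It writes a temporary file,
fsync()s it, renames it over the target, fsync()s the directory, and
finally calls sync(). Going by the test results above, even this is not
guaranteed to be enough when /boot is on XFS.

```python
import os
import tempfile


def replace_file_durably(path, data):
    """Hypothetical helper: atomically replace `path` with `data` (bytes)."""
    directory = os.path.dirname(os.path.abspath(path))
    # Temp file in the same directory so the rename stays on one file system.
    # Note: mkstemp creates it with mode 0600; real code would copy the mode.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()              # flush Python's userspace buffer
            os.fsync(tmp.fileno())   # ask the kernel to write the file out
        os.replace(tmp_path, path)   # atomic rename over the old file
    except BaseException:
        # Clean up the temp file if anything failed before the rename.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    # fsync the directory so the rename itself is written out.
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
    # System-wide sync(), the step proposed here for new-kernel-pkg.
    os.sync()
```

The final os.sync() is the part that corresponds to the proposed
new-kernel-pkg change; the temp-file-and-rename steps are an extra
assumption, added so a crash mid-write leaves either the old or the new
file rather than a partial one.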
The most common outcome I ran into was a partially modified grub.cfg
containing an incomplete default entry for the new kernel, missing the
initrd line that grubby inserts. So it can't boot. But if the grub menu
appears at all, the previous kernel and rescue entries are still there,
and they work. Log replay during boot fixes up everything, and the next
boot works fine by default without user interaction. So even though
sync() is still not always good enough for XFS, it's way better than no
sync().

Meanwhile FAT, ext2, ext4, and Btrfs all appear to flush to disk with
sync() well enough for the bootloader to have no problems.

Note that fsync() is actually insufficient at least some of the time on
all journaled file systems. For example, dracut -f only does an fsync()
on the initramfs it copies to /boot, and on journaled file systems the
bootloader often doesn't see that new initramfs. Instead, the bootloader
sees the old initramfs that had been deleted. If that boot succeeds, log
replay completes the deletion of the old initramfs, and the new one
appears and is used at the next boot. Other combinations are possible.

Minor in comparison, I see grubby doing a metric ton of fdatasync() on
/var/log/grubby - 80 fdatasync() calls for 17 lines and 1.2KiB of
changes for a single kernel installation. Does that make sense to
anyone? How about one fdatasync() for all of those changes? Or just
fsync() the file?

I think any plan for doing freeze/thaw anywhere can be postponed pending
further discussion and testing. In the meantime I'll come up with a
patch for new-kernel-pkg to do a sync() unless someone else beats me to
it.

--
Chris Murphy
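
As a rough illustration of the "one fdatasync() for all of those
changes" idea, a minimal Python sketch follows. It is not grubby's
logging code; the helper name append_log_lines is hypothetical, and it
assumes all the log lines for one kernel installation are known before
anything is written. os.fdatasync() is assumed to be available, as it
is on Linux.

```python
import os


def append_log_lines(path, lines):
    """Hypothetical helper: append all lines, then fdatasync() once."""
    fd = os.open(path, os.O_WRONLY | os.O_APPEND | os.O_CREAT, 0o644)
    try:
        payload = "".join(line + "\n" for line in lines).encode("utf-8")
        view = memoryview(payload)
        # os.write may write fewer bytes than asked, so loop until done.
        while view:
            written = os.write(fd, view)
            view = view[written:]
        # A single flush for the whole batch instead of one per write.
        os.fdatasync(fd)
    finally:
        os.close(fd)
```

Called once per installation with the 17 or so lines, this would issue
one fdatasync() rather than 80. Whether grubby can batch its log
writes this way is an open question.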