Re: Fedora 27 kernel updates make system unbootable (sort of)

> On Fri, Apr 20, 2018 at 3:32 PM, Lennart Poettering
> <mzerqung(a)0pointer.de> wrote:
> 
> OK after a bunch of additional testing, one thing is certain; grubby
> and grub-mkconfig are not even remotely crash safe. They do not
> fsync() or sync() at all. That's not good no matter the file system.

May I point you to https://github.com/rhboot/grubby/pull/24 which *should* fix most of that.

https://github.com/rhboot/grubby/pull/28 will fix *all* of that when /boot is a separate FS.
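
For reference, the usual pattern for replacing a file such as grub.cfg in a crash-safe way looks roughly like the sketch below. It is written in Python for brevity and is not the code from either pull request; the function name atomic_write is made up for illustration. It assumes a POSIX file system where rename is atomic, which is not guaranteed on FAT. It also does nothing about a bootloader that reads the disk without replaying the journal, which is the harder problem in this thread.

```python
import os
import tempfile


def atomic_write(path, data):
    """Replace `path` with `data` (bytes) so that a crash leaves either
    the complete old file or the complete new file, never a partial one.

    Hypothetical helper for illustration only.
    """
    dirname = os.path.dirname(os.path.abspath(path))

    # Write the new content to a temporary file in the same directory,
    # so the later rename stays within one file system.
    fd, tmp = tempfile.mkstemp(dir=dirname, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            # Force the file's data and metadata to stable storage.
            os.fsync(f.fileno())
        # Atomically swap the new file into place.
        os.replace(tmp, path)
    except BaseException:
        # Clean up the temporary file if anything failed before the rename.
        try:
            os.unlink(tmp)
        except FileNotFoundError:
            pass
        raise

    # fsync the containing directory so the rename itself is durable.
    dfd = os.open(dirname, os.O_RDONLY)
    try:
        os.fsync(dfd)
    finally:
        os.close(dfd)
```

Note that mkstemp creates the file with mode 0600, so a real tool would probably also want to copy the permissions of the file it replaces.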

> The best near term change that I think we should make is putting a
> sync() somewhere in grubby. I've tested modifying new-kernel-pkg to
> add a sync right before the last line, and
> 
> # dnf update -y /path/to/kernelrpms && echo b > /proc/sysrq-trigger
> 
> With very limited testing (dozens of reboots in multiple fs
> configurations of a VM only, using unsafe caching), all configurations
> including XFS are better off with sync() added. It's a highly
> non-deterministic mess to sort out without sync().
> 
> But with sync() and with /boot on XFS: the kernel, initramfs, or
> grub.cfg - there is an unknown (let's say 50/50) probability the
> bootloader can't find or read those files despite the sync(). The most
> common outcome I ran into was a partially modified grub.cfg containing
> an incomplete default entry for the new kernel which is missing the
> grubby insertion for the initrd. So it can't boot. But if the grub
> menu appears at all, the previous kernel and rescue entries are still
> there, and they work. Log replay during boot fixes up everything and
> now the next boot works fine by default without user interaction. So
> even though sync() is still not good enough for XFS always, it's way
> better than no sync().
> 
> Meanwhile FAT, ext2, ext4, and Btrfs all appear to flush to disk with
> sync() sufficient for the bootloader to have no problems.
> 
> Note that fsync() is actually insufficient at least some of the time
> on all journaled file systems. e.g. dracut -f only does fsync() on the
> copied over initramfs to /boot and that new initramfs is often not
> seen by the bootloader for journaled file systems.

Not after https://github.com/dracutdevs/dracut/commit/58e3971b920fbb60e5a90edfd30aa887f9818100

> Instead, the
> bootloader sees the old initramfs that had been deleted, which during
> boot (if successful) replays the log and completes the delete of old
> initramfs, and now the new one appears and is used at next boot. Other
> combinations are possible.
> 
> Minor in comparison, I see grubby doing a metric ton of fdatasync() on
> /var/log/grubby: 80 fdatasync() calls for 17 lines and 1.2 KiB of
> changes for a single kernel installation. Does that make sense to
> anyone? How about one fdatasync() for all of those changes? Or just
> fsync() the file?
> 
> I think any plan for doing freeze/thaw anywhere can be postponed
> pending further discussion and testing.
> 
> In the meantime I'll come up with a patch for new-kernel-pkg to do a
> sync() unless someone else beats me to it.
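
On the fdatasync() count for /var/log/grubby: batching the log lines and flushing once would look something like the sketch below. This is Python for illustration only, not how grubby's logging is actually implemented, and append_log_lines is a hypothetical name.

```python
import os


def append_log_lines(path, lines):
    """Append all `lines` to the log at `path`, then flush to disk once.

    Hypothetical helper for illustration only.
    """
    fd = os.open(path, os.O_WRONLY | os.O_APPEND | os.O_CREAT, 0o644)
    try:
        data = "".join(line + "\n" for line in lines).encode()
        view = memoryview(data)
        # os.write may write fewer bytes than requested, so loop until done.
        while view:
            written = os.write(fd, view)
            view = view[written:]
        # One fdatasync() for the whole batch instead of one per line.
        os.fdatasync(fd)
    finally:
        os.close(fd)
```

The trade-off is that a crash in the middle of the batch can lose more log lines than a per-line flush would, which seems acceptable for a log of this kind.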