Re: Fedora 27 kernel updates make system unbootable (sort of)

On Fri, Apr 20, 2018 at 3:32 PM, Lennart Poettering
<mzerqung@xxxxxxxxxxx> wrote:
> On Fr, 20.04.18 12:20, Chris Murphy (lists@xxxxxxxxxxxxxxxxx) wrote:
>
>> I'm honestly mystified why the plymouth commit hasn't been reverted in
>> the interim. But I'm also mystified why the bootloader folks don't
>> give a shit to commit their configuration files to disk when they know
>> they can't do journal replay and have known that for 20 years. But
>> then I'm also mystified why systemd developers won't fallback to
>> freeze/thaw if rootfs remount-ro fails three times, instead of just
>> giving up and forcing a reboot, they did discuss doing this a year ago
>> and then poof, no action.
>
> Quite frankly, if you want to put the blame somewhere, I'd probably
> place it with the xfs folks? I mean, there's a well-defined API on
> linux for syncing a file system to disk so that it is in a clean
> state, it's called sync(). Turns out that doesn't work though, it
> doesn't actually do that.

OK, after a bunch of additional testing, one thing is certain: grubby
and grub-mkconfig are not even remotely crash safe. They do not call
fsync() or sync() at all. That's not good no matter the file system.

The best near-term change that I think we should make is putting a
sync() somewhere in grubby. I've tested modifying new-kernel-pkg to
add a sync right before the last line, and then running:

# dnf update -y /path/to/kernelrpms && echo b > /proc/sysrq-trigger

With very limited testing (dozens of reboots in multiple fs
configurations of a VM only, using unsafe caching), all configurations
including XFS are better off with sync() added. It's a highly
non-deterministic mess to sort out without sync().
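
To make the idea concrete, here is a minimal sketch in Python of a
config update that flushes before returning. It is not the
new-kernel-pkg change itself (that is a shell script) and not how
grubby is implemented; the function name write_then_sync and the
temp-file-plus-rename approach are assumptions made for illustration.
The only part taken from the testing above is the final sync().

```python
import os
import tempfile


def write_then_sync(path, data):
    """Replace `path` with `data` (bytes), then flush to disk.

    Hypothetical helper for illustration only.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # Temp file in the same directory, so the rename stays on one file system.
    # Note: mkstemp creates the file with mode 0600.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # flush the new file's contents
        os.replace(tmp_path, path)  # atomically swap it into place
    except BaseException:
        # Clean up the temp file if anything failed before the rename.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
    # Flush the directory entry created by the rename.
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
    # Finally ask the kernel to flush everything, as in the tested change.
    os.sync()
```

A caller would use something like write_then_sync("/boot/grub2/grub.cfg",
new_contents), where the path is only an example. This narrows the
window for a crash to leave a half-written file, but it cannot make a
bootloader that skips journal replay see the result.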

But with sync() and with /boot on XFS, there is an unknown (let's say
50/50) probability that the bootloader can't find or read the kernel,
initramfs, or grub.cfg despite the sync(). The most
common outcome I ran into was a partially modified grub.cfg containing
an incomplete default entry for the new kernel which is missing the
grubby insertion for the initrd. So it can't boot. But if the grub
menu appears at all, the previous kernel and rescue entries are still
there, and they work. Log replay during boot fixes up everything and
now the next boot works fine by default without user interaction. So
even though sync() is still not good enough for XFS always, it's way
better than no sync().

Meanwhile FAT, ext2, ext4, and Btrfs all appear to flush to disk with
sync() sufficient for the bootloader to have no problems.

Note that fsync() is actually insufficient at least some of the time
on all journaled file systems. e.g. dracut -f only does fsync() on the
initramfs it copies to /boot, and that new initramfs is often not
seen by the bootloader on journaled file systems. Instead, the
bootloader sees the old initramfs that had been deleted. If that boot
succeeds, log replay completes the delete of the old initramfs, and
the new one appears and is used at the next boot. Other
combinations are possible.

Minor in comparison, I see grubby doing a metric ton of fdatasync()
calls on /var/log/grubby: 80 fdatasync() calls for 17 lines and 1.2KiB
of changes for a single kernel installation. Does that make sense to
anyone? How about one fdatasync() for all of those changes? Or just
fsync() the file?
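
As a rough sketch of the "one fdatasync() for all of those changes"
idea, in Python rather than grubby's own code: collect the log lines,
write them, and flush once at the end. The function name
append_log_lines is hypothetical, and this assumes Linux, where
os.fdatasync() is available.

```python
import os


def append_log_lines(path, lines):
    """Append all `lines` to `path`, then call fdatasync() exactly once.

    Hypothetical helper for illustration only.
    """
    fd = os.open(path, os.O_WRONLY | os.O_APPEND | os.O_CREAT, 0o644)
    try:
        payload = "".join(line + "\n" for line in lines).encode("utf-8")
        view = memoryview(payload)
        # os.write() may write fewer bytes than asked, so loop until done.
        while view:
            written = os.write(fd, view)
            view = view[written:]
        # One flush for the whole batch instead of one per line.
        os.fdatasync(fd)
    finally:
        os.close(fd)
```

The trade-off is that a crash in the middle of the batch can lose more
log lines than flushing after each write would.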

I think any plan for doing freeze/thaw anywhere can be postponed
pending further discussion and testing.

In the meantime I'll come up with a patch for new-kernel-pkg to do a
sync() unless someone else beats me to it.



-- 
Chris Murphy



