Re: "rescue" boot entry files are not updated on OS upgrades

On Tue, Mar 1, 2022 at 3:38 PM Chris Murphy <lists@xxxxxxxxxxxxxxxxx> wrote:
>
> Summary----------
> Most all Fedora variants (except Cloud) have a GRUB menu entry
> containing the word "rescue". This kernel+initramfs pair are never
> updated for the life of a Fedora installation. And they quickly become
> stale as a Fedora installation ages. This kernel's modules are
> eventually deleted, and if selected at boot time, the typical user
> experience is a dracut shell.
>
> Basic background-------------
> (skip this section if you know how it works)
> During a new installation, a single kernel version is installed. e.g.
>
> vmlinuz-5.17.0-0.rc4.96.fc36.x86_64 which is then duplicated as e.g.
> vmlinuz-0-rescue-3a86878de5d649a983916543ece7bb7e.
>
> Each of those (identical) kernels has an initramfs file:
>
> initramfs-5.17.0-0.rc4.96.fc36.x86_64.img
> initramfs-0-rescue-3a86878de5d649a983916543ece7bb7e.img
>
> The sole difference is the first one is a smaller host-only initramfs,
> the second one is a larger no host-only initramfs created with `dracut
> -N`. The bigger one just contains a bunch of extra kernel modules and
> dracut scripts, ostensibly to make it more likely to boot a system
> with some change in hardware that the host-only initramfs doesn't
> contain. The size of this rescue initramfs is around 100 MiB, with the
> common day to day "host only" initramfs being around 33 MiB. [1]
>
> As the system is updated, additional kernel versions are installed.
> dnf.conf contains installonly_limit=3, which results in a maximum of
> three kernel versions being installed at a time. Once a fourth kernel
> is installed, the first kernel and its modules are removed from
> /usr/lib/modules. The rescue kernel+initramfs pair are never updated
> or upgraded, even during system upgrades.
>
> Observations------------
> This has been discussed by the Workstation working group [2] but since
> this functionality is present in all of Fedora, we're moving the
> discussion for greater visibility.
>
> There are two separate complaints, if you will: (a) that the
> kernel+initramfs pair are never updated or upgraded for the life of the
> installation; and (b) that even during one release cycle, the user
> experience when booting the rescue entry changes, i.e. when the
> matching /usr/lib/modules for the rescue entry are present early on,
> you get full runtime behavior and reach a graphical
> environment. But once the version-matched /usr/lib/modules are
> removed, you get completely different behavior when booting the
> rescue entry.
>
> An important note from that ticket from Justin Forbes, the Fedora
> kernel maintainer: " Remember, the only real purpose of the rescue
> kernel is to get your system out of something completely unusable. It
> isn't meant to be a full runtime."
>
>
> Questions------------
>
> * Considering the very narrow purpose of the entry, maybe the current
> behavior is adequate?
>
> * Does the rescue entry reliably get users to a dracut prompt, rather
> than indefinite hang? I don't know whether it does.

I am surprised that the rescue kernel would give an indefinite hang or
even just a dracut prompt within a release. I understand that people
who constantly upgrade may have a very old rescue kernel, which
doesn't natively support the things that current installs do and could
have issues, but you should be able to reliably boot to a terminal with
network support from the rescue.  That starts to fall apart if you
installed a system on a very old release, still have the same rescue
kernel after a few years of upgrades, and in the meantime added new
hardware or converted a filesystem to btrfs, which wasn't built in
before it became a default for some editions.  It is my opinion that
the purpose of the rescue kernel is to get you console access with
networking so that you can fix whatever issue made you boot the rescue
to begin with, no more, no less.
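
To make the timeline from the quoted background concrete, here is a
minimal sketch of the retention arithmetic. It is a simplified model,
not how dnf is implemented: it assumes the oldest kernel is always the
one removed, and the version labels k1..k4 are placeholders.

```python
def simulate_kernel_retention(versions, installonly_limit=3):
    """Model which kernels stay installed as new ones arrive.

    versions[0] stands for the install-time kernel that the rescue
    entry was copied from. Returns one (installed, rescue_modules_present)
    tuple per install step.
    """
    rescue_version = versions[0]
    installed = []
    history = []
    for version in versions:
        installed.append(version)
        # Simplification: drop the oldest kernel (and its modules)
        # whenever the limit is exceeded.
        while len(installed) > installonly_limit:
            installed.pop(0)
        history.append((list(installed), rescue_version in installed))
    return history


if __name__ == "__main__":
    for kernels, present in simulate_kernel_retention(["k1", "k2", "k3", "k4"]):
        print(kernels, "rescue modules present:", present)
```

With the limit at 3, the modules matching the rescue kernel go away on
the fourth kernel install, which is the point where the behavior of the
rescue entry changes.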

> * Is there any way to improve the situation without increasing the
> risk that the rescue entry becomes totally non-functional?
>    * The chosen kernel version needs to be based on one that is known
> to boot. Currently we know the kernel+initramfs pair work because it's
> the same version used to boot the installation media when doing the
> initial provisioning. We don't actually know an updated replacement
> "no host-only" initramfs will work until it's tried. Is it possible to
> automate this? And is it worth the risk, or even figuring out how to
> assess the risk?

This gets a bit tricky. Creating the rescue at install time is safe
because either it works, or the system you just installed doesn't work
either.  When you do have a working rescue though, replacing it with a
new kernel, even one that you have successfully booted, is not
guaranteed to be safer.  What if networking doesn't work in certain
circumstances, or there are any number of other issues that create
problems but still "boot"?  I tend to hand create a new rescue when I
add new hardware which might require it, but that is about all. One
system here has a rescue kernel from 2016.  This system has one from
about a year ago because I replaced hardware and generated a new one.
The original OS image was installed at F27 or so I think.
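
For readers wondering what hand-creating a rescue might involve, below
is a dry-run sketch. It rests on assumptions not confirmed in this
thread: that the dracut rescue kernel-install plugin is installed and
recreates the rescue pair when it is missing, and that removing the
old files and running `kernel-install add` is an acceptable way to
trigger it. The helper names are hypothetical, and the code only builds
the command list; it runs nothing.

```python
import platform
from pathlib import Path


def rescue_files(boot_dir):
    """Return the rescue kernel and initramfs files found in boot_dir."""
    boot = Path(boot_dir)
    found = list(boot.glob("vmlinuz-0-rescue-*"))
    found += list(boot.glob("initramfs-0-rescue-*.img"))
    return sorted(found)


def regeneration_plan(boot_dir, kernel_version):
    """Build, but do not execute, the commands for a fresh rescue pair.

    Assumption: with the old pair gone, `kernel-install add` lets the
    rescue plugin generate a new one from the given kernel version.
    """
    plan = [["rm", "--", str(path)] for path in rescue_files(boot_dir)]
    plan.append([
        "kernel-install", "add", kernel_version,
        f"/lib/modules/{kernel_version}/vmlinuz",
    ])
    return plan


if __name__ == "__main__":
    # Print the plan for the running kernel; nothing is modified.
    for command in regeneration_plan("/boot", platform.release()):
        print(" ".join(command))
```

Keeping this as a printed plan rather than something that executes is
deliberate, given the point above that replacing a working rescue is
not guaranteed to be safer.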

Justin

>    * At Flock 2021, Zbyszek proposed "Building Initrd Images from
> RPMs" to reduce the complexity of building initramfs, maybe there's a
> role for it here? More: https://www.youtube.com/watch?v=GATg_bqmASc
>
> * What happens if we accept some scope creep, and go for many
> improvements that make the extra work worth it?
>     * What about the unsigned nature of the initramfs? Should we be
> creating initramfs's in Fedora infra and signing them?
>     * Stuff a graphical rescue environment into the initramfs? (This
> might be ten leaps too far, but it's intended to encourage thinking
> with a vivid imagination.)
>
> [1] both values from a recent Fedora 36 Workstation installation
>
> [2] https://pagure.io/fedora-workstation/issue/259
>
> --
> Chris Murphy
> _______________________________________________
> devel mailing list -- devel@xxxxxxxxxxxxxxxxxxxxxxx
> To unsubscribe send an email to devel-leave@xxxxxxxxxxxxxxxxxxxxxxx
> Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/
> List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
> List Archives: https://lists.fedoraproject.org/archives/list/devel@xxxxxxxxxxxxxxxxxxxxxxx
> Do not reply to spam on the list, report it: https://pagure.io/fedora-infrastructure