On Tue, Mar 1, 2022 at 3:38 PM Chris Murphy <lists@xxxxxxxxxxxxxxxxx> wrote:
>
> Summary----------
> Most all Fedora variants (except Cloud) have a GRUB menu entry
> containing the word "rescue". This kernel+initramfs pair are never
> updated for the life of a Fedora installation. And they quickly become
> stale as a Fedora installation ages. This kernel's modules are
> eventually deleted, and if selected at boot time, the typical user
> experience is a dracut shell.
>
> Basic background-------------
> (skip this section if you know how it works)
> During a new installation, a single kernel version is installed. e.g.
>
> vmlinuz-5.17.0-0.rc4.96.fc36.x86_64 which is then duplicated as e.g.
> vmlinuz-0-rescue-3a86878de5d649a983916543ece7bb7e.
>
> Each of those (identical) kernels has an initramfs file:
>
> initramfs-5.17.0-0.rc4.96.fc36.x86_64.img
> initramfs-0-rescue-3a86878de5d649a983916543ece7bb7e.img
>
> The sole difference is the first one is a smaller host-only initramfs,
> the second one is a larger no host-only initramfs created with
> `dracut -N`. The bigger one just contains a bunch of extra kernel
> modules and dracut scripts, ostensibly to make it more likely to boot
> a system with some change in hardware that the host-only initramfs
> doesn't contain. The size of this rescue initramfs is around 100 MiB,
> with the common day to day "host only" initramfs being around 33 MiB. [1]
>
> As the system is updated, additional kernel versions are installed.
> dnf.conf contains installonly_limit=3, which results in a maximum of
> three kernel versions being installed at a time. Once a fourth kernel
> is installed, the first kernel and its modules are removed from
> /usr/lib/modules. The rescue kernel+initramfs pair are never updated
> or upgraded, even during system upgrades.
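
The retention rule described above can be sketched in a few lines of
Python. This is only a simplified model, not how dnf is implemented: it
assumes kernels are pruned strictly oldest-first and ignores special
cases such as protecting the running kernel. The function names and the
"update-N" version labels are made up for illustration.

```python
from typing import List


def kernels_with_modules(install_order: List[str], limit: int = 3) -> List[str]:
    """Return the kernel versions still installed, oldest first.

    Simplified model of installonly_limit: each newly installed kernel
    pushes out the oldest one once more than `limit` are present.
    """
    kept: List[str] = []
    for version in install_order:
        kept.append(version)
        if len(kept) > limit:
            # The oldest kernel and its /usr/lib/modules tree are removed.
            kept.pop(0)
    return kept


def rescue_modules_present_in_model(install_order: List[str], limit: int = 3) -> bool:
    """True if the install-time kernel (the one copied to the rescue entry)
    still has its modules installed according to the model above."""
    rescue_version = install_order[0]
    return rescue_version in kernels_with_modules(install_order, limit)
```

With the install-time kernel plus two updates the rescue entry still has
matching modules; in this model a third update removes them, while the
rescue kernel+initramfs pair itself stays untouched.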
>
> Observations------------
> This has been discussed by the Workstation working group [2] but since
> this functionality is present in all of Fedora, we're moving the
> discussion for greater visibility.
>
> There's two separate complaints, if you will: (a) that the
> kernel+initramfs pair are never updated or upgraded for the life of the
> installation; and (b) that even during one release cycle, the user
> experience when booting the rescue entry, changes, i.e. when the
> matching /usr/lib/modules for the rescue entry are present early on,
> you do get a full runtime behavior, you will get to a graphical
> environment. But then once the version matched /usr/lib/modules are
> removed, you get a completely different behavior when booting the
> rescue entry.
>
> An important note from that ticket from Justin Forbes, the Fedora
> kernel maintainer: "Remember, the only real purpose of the rescue
> kernel is to get your system out of something completely unusable. It
> isn't meant to be a full runtime."
>
>
> Questions------------
>
> * Considering the very narrow purpose of the entry, maybe the current
> behavior is adequate?
>
> * Does the rescue entry reliably get users to a dracut prompt, rather
> than indefinite hang?

I don't know whether it does. I am surprised that the rescue kernel
would give an indefinite hang, or even just a dracut prompt, within a
release. I understand that people who constantly upgrade may have a
very old rescue kernel, which doesn't natively support the things that
current installs do and could have issues, but you should be able to
reliably boot to a terminal with network support from the rescue entry.
That starts to fall apart if you installed a system on a very old
release and, after a few years of upgrades, still have the same rescue
kernel, and in the meantime added new hardware or converted a
filesystem to btrfs, which wasn't built in before it became a default
for some editions.
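
Whether a rescue entry still has matching modules can be checked,
roughly, by reading the release string out of the rescue kernel image
and looking for the corresponding directory under /usr/lib/modules. The
following is a sketch in Python, not an existing tool. It assumes an x86
bzImage whose setup header carries the version-string pointer defined by
the Linux x86 boot protocol (16-bit field at offset 0x20E, string
located at pointer + 0x200). Other architectures and image formats are
not handled, and the function names are illustrative.

```python
import struct
from pathlib import Path
from typing import Optional

HDRS_MAGIC_OFFSET = 0x202   # "HdrS" signature of the x86 setup header
KERNEL_VERSION_PTR = 0x20E  # 16-bit pointer to the version string, less 0x200


def kernel_release_from_image(data: bytes) -> Optional[str]:
    """Return the release string embedded in an x86 bzImage, or None."""
    if len(data) < KERNEL_VERSION_PTR + 2:
        return None
    if data[HDRS_MAGIC_OFFSET:HDRS_MAGIC_OFFSET + 4] != b"HdrS":
        return None  # not an image this sketch understands
    (ptr,) = struct.unpack_from("<H", data, KERNEL_VERSION_PTR)
    if ptr == 0:
        return None  # header present, but no version string recorded
    start = ptr + 0x200
    end = data.find(b"\0", start)
    if start >= len(data) or end == -1:
        return None
    banner = data[start:end].decode("ascii", errors="replace")
    # The banner starts with the release, followed by build details.
    parts = banner.split()
    return parts[0] if parts else None


def rescue_modules_present(
    image: Path, modules_root: Path = Path("/usr/lib/modules")
) -> Optional[bool]:
    """True/False if the image's release has a modules directory,
    None if the release could not be determined."""
    release = kernel_release_from_image(image.read_bytes())
    if release is None:
        return None
    return (modules_root / release).is_dir()
```

On the installation quoted above, the image to pass in would be the
vmlinuz-0-rescue-3a86878de5d649a983916543ece7bb7e file, presumably
under /boot.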

It is my opinion that the purpose of the rescue kernel is to get you to
console access with networking so that you can fix whatever issue made
you boot the rescue entry to begin with, no more, no less.

> * Is there any way to improve the situation without increasing the
> risk that the rescue entry becomes totally non-functional?
> * The chosen kernel version needs to be based on one that is known
> to boot. Currently we know the kernel+initramfs pair work because it's
> the same version used to boot the installation media when doing the
> initial provisioning. We don't actually know an updated replacement
> "no host-only" initramfs will work until it's tried. Is it possible to
> automate this? And is it worth the risk, or even figuring out how to
> assess the risk?

This gets a bit tricky. Creating the rescue entry at install time is
safe because either it works, or the system you just installed doesn't
work either. Once you have a working rescue entry, though, replacing it
with a new kernel, even one that you have successfully booted, is not
guaranteed to be safer. What if networking doesn't work in certain
circumstances, or there are any number of other issues that create
problems while the system still "boots"? I tend to hand-create a new
rescue image when I add new hardware that might require it, but that is
about all. One system here has a rescue kernel from 2016. This system
has one from about a year ago because I replaced hardware and generated
a new one. The original OS image was installed at F27 or so, I think.

Justin

> * At Flock 2021, Zbyszek proposed "Building Initrd Images from
> RPMs" to reduce the complexity of building initramfs, maybe there's a
> role for it here? More: https://www.youtube.com/watch?v=GATg_bqmASc
>
> * What happens if we accept some scope creep, and go for many
> improvements that make the extra work worth it?
> * What about the unsigned nature of the initramfs? Should we be
> creating initramfs's in Fedora infra and signing them?
> * Stuff a graphical rescue environment into the initramfs?
> (This might be ten leaps too far, but it's intended to encourage
> thinking with a vivid imagination.)
>
> [1] both values from a recent Fedora 36 Workstation installation
>
> [2] https://pagure.io/fedora-workstation/issue/259
>
> --
> Chris Murphy
> _______________________________________________
> devel mailing list -- devel@xxxxxxxxxxxxxxxxxxxxxxx
> To unsubscribe send an email to devel-leave@xxxxxxxxxxxxxxxxxxxxxxx
> Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/
> List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
> List Archives: https://lists.fedoraproject.org/archives/list/devel@xxxxxxxxxxxxxxxxxxxxxxx
> Do not reply to spam on the list, report it: https://pagure.io/fedora-infrastructure