Re: Need help with a weird kernel update panic.

You have to hit the timing right, i.e. install the kernel package and
reboot as quickly as possible (automated, or very efficiently by hand).

And if the update is more than just the kernel, that may slow down the
process enough that even an immediate reboot won't be quick enough.

I have seen it 3-5 times, and that is over a huge number of machines.
In those cases the machines were booted and failed multiple times before
someone booted a live CD and ran fsck on and/or mounted /boot, and the
files were found after that.
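
For anyone who wants to check whether a /boot filesystem still has
unreplayed journal data, here is a minimal sketch in Python.  It assumes
the "Filesystem features:" line as printed by tune2fs -l, where
has_journal means the filesystem has a journal and needs_recovery means
the journal has not been replayed yet.  The function name and the sample
text are made up for illustration.  As far as I know the needs_recovery
flag stays set for as long as the filesystem is mounted read-write, so
this is mostly useful from a live CD before /boot is mounted.

```python
def journal_state(tune2fs_output):
    """Return (has_journal, needs_recovery) parsed from `tune2fs -l` text."""
    for line in tune2fs_output.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "Filesystem features":
            features = value.split()
            return ("has_journal" in features, "needs_recovery" in features)
    # No feature line found: report nothing rather than guess.
    return (False, False)


# Made-up sample in the shape of `tune2fs -l <device>` output.
SAMPLE = """\
Filesystem volume name:   <none>
Filesystem features:      has_journal ext_attr dir_index needs_recovery extent
Filesystem state:         clean
"""

if __name__ == "__main__":
    print(journal_state(SAMPLE))
```

To feed it real data you would run something like tune2fs -l /dev/sda1
as root and pass the text in; the device name is only an example taken
from the mount line quoted below.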

On Fri, May 8, 2020 at 2:00 PM Mauricio Tavares <raubvogel@xxxxxxxxx> wrote:
>
> On Fri, May 8, 2020 at 12:12 PM Roger Heflin <rogerheflin@xxxxxxxxx> wrote:
> >
> > A sync will flush the writes to the journal, where the data is safe.  It
> > will not force a replay of the journal.
> >
> > Nothing except removing the journal from the ext4 filesystem will fix it.
> >
> > This is not a Fedora bug; this is a long-standing
> > kernel/grub/filesystem interaction bug (all who use a journaled
> > filesystem have this bug).
> >
> > See tune2fs; something like -O ^has_journal will turn off the
> > journal.  It has to be done with the filesystem unmounted, and verify
> > that your fstab entry will remount it.
> >
> > Check /proc/mounts: having data=XXX (probably ordered) says you have a
> > journal.  After the umount + above tune2fs + remount, the data=ordered
> > will be gone.
> >
>       Interesting. I have a box which has been running for years with
>
> /dev/sdb1 /boot ext4 rw,seclabel,noatime,barrier=1,stripe=32,data=ordered,discard 0 0
>
> and so far never borked on me.
>
> > On Fri, May 8, 2020 at 10:53 AM John Mellor <john.mellor@xxxxxxxxx> wrote:
> > >
> > > Interesting!  This machine does reboot in about 5secs and the other
> > > machines take longer, so it makes sense.  My /boot is mounted just like
> > > /home and / as follows:
> > >
> > >     /dev/sda1 on /boot type ext4 (rw,relatime,seclabel)
> > >
> > > I assume that a simple sync would flush the journal.  It's pretty easy to
> > > do a sync;sync if updating using the CLI, but not possible when using
> > > the GUI.  Is this a Fedora bug where the journal is not correctly
> > > flushed on the reboot?  Should I modify that mount entry or do a chattr
> > > change to work around the bug?
> > >
> > >
> > > On 2020-05-08 11:11 a.m., Roger Heflin wrote:
> > > > > What you are saying does not exactly match what I have previously
> > > > > seen, but there is a known feature with using a journaling filesystem
> > > > > (ext4 with a journal, or xfs) for /boot.  If only the journal has been
> > > > > updated and it has not yet been replayed into the main filesystem,
> > > > > then grub will not be able to find the new or updated files (grub's
> > > > > filesystem code is simple and does not process the journal, so if
> > > > > critical updates are still in the journal then those updates, changed
> > > > > files or new files, cannot be seen).  To hit this one generally has to
> > > > > do the update and almost immediately reboot (within a few minutes,
> > > > > though, in some cases; note that syncing does not replay the journal).
> > > > > The fix is to boot up with a kernel that grub can still find and/or a
> > > > > livecd and mount /boot so that the journal gets replayed, or fsck
> > > > > /boot so that the journal gets replayed.
> > > >
> > > > > Long term the solution is to move /boot to a non-journaled fs (ext
> > > > > without a journal), or after each update to umount/mount /boot (before
> > > > > the reboot).  If /boot is not a separate filesystem then you cannot
> > > > > umount/mount it to get the journal to replay.  There is a second
> > > > > method to force a journal replay, but reports say that one often
> > > > > "hangs" when /boot is not separate, so it is not a reliable solution.
> > > > > There were some detailed posts on this several years ago with reliable
> > > > > commenters confirming the behavior.  I have also personally seen the
> > > > > issue a number of times, and mounting /boot and/or fscking it corrects
> > > > > it (replays the journal).
> > > >
> > > > On Fri, May 8, 2020 at 8:52 AM John Mellor <john.mellor@xxxxxxxxx> wrote:
> > > >> I have one completely stock workstation F32 machine where kernel updates
> > > >> almost always cause a multiple-reboot panic problem.  This problem also
> > > >> occurred on F31, but not on releases before that. I'm stumped and need
> > > >> some help in figuring it out.
> > > >>
> > > >> The symptoms vary in the number of reboots and the type of tertiary
> > > >> error, but are otherwise pretty similar.  It does not matter whether I
> > > >> use the Gnome update app or the CLI dnf method. After a number of
> > > > >> reboots, the upgrade succeeds and Fedora behaves normally again.  I
> > > >> think that this only happens whenever the kernel is upgraded.
> > > >>
> > > > >> What I observe is that the machine is rebooted and on reboot, grub (I
> > > > >> think) gets a halt for a 32-bit relocation error.  This sequence may
> > > > >> happen twice.  It's an i7 with plenty of memory and an SSD boot disk, so
> > > > >> the 32-bit thing is confusing.  To get around this error, I power-cycle
> > > > >> the box and get into the next stage of the problem.  On the 2nd or 3rd
> > > > >> reboot, I usually see a halt with an access outside of the kernel space,
> > > > >> although with the update this morning, I had a kernel panic instead.
> > > > >> After cold-booting again, the update is installed, and after the last
> > > > >> reboot I'm up on the new updates.
> > > >>
> > > >> After that, the machine behaves normally until the next kernel updates.
> > > >> I assume that there is some incorrectly-asynchronous operation in grub
> > > >> related to the update entry, but I can find no grub logs to dig into
> > > >> this problem.  I have several other machines that do not see this
> > > >> problem.  I dug around in the fedora bugs, but not knowing what to look
> > > > >> for, I'm basically blind.  It's a pretty serious bug, especially if the
> > > >> machine is remote.  Does anyone have a way out of this?
> > > >>
> > > >> --
> > > >>
> > > >> John Mellor
> > > >> _______________________________________________
> > > >> users mailing list -- users@xxxxxxxxxxxxxxxxxxxxxxx
> > > >> To unsubscribe send an email to users-leave@xxxxxxxxxxxxxxxxxxxxxxx
> > > >> Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/
> > > >> List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
> > > >> List Archives: https://lists.fedoraproject.org/archives/list/users@xxxxxxxxxxxxxxxxxxxxxxx