[RFC V2 PATCH 0/1] kexec: crash_kexec_post_notifiers boot option related fixes

Hidehiro Kawai <hidehiro.kawai.ez at hitachi.com> writes:

> Hello Eric and Vivek,
>
> Do you have any comments?

crash_kexec_post_notifiers is a debugging hack to allow people to test
if the kmsg_dump works better than kexec.  crash_kexec_post_notifiers is
not, nor has it ever been, a solution for general operation (which is
what I perceive this work trying to push).

I will not support any work that expands crash_kexec_post_notifiers to
be more than it currently is, because people want ``panic hooks'' to
run before kexec.  That approach was extensively tested before kexec on
panic was implemented in the kernel and every implementation failed.
The practical symptom was that everything would work ok in testing but
on failures in the real world there would be enough going on in the
dying kernel that no crash dump would be taken.  kexec on panic on the
other hand works a reasonable fraction of the time.

I deeply and fundamentally can not support a general purpose hook being
called before kexec.  In 15 years of practice I have never heard of a
case where using a general purpose hook does anything but make kexec on
panic undebuggable in practice.

A specific hook for a very specific purpose, when there is no other way,
we can consider.

If you don't have something that generalises well into a general purpose
operation that it makes sense for everyone to call, you can always use
the world's largest hook, aka you can run code before the new kernel
starts that is loaded with kexec_load.

If you absolutely must run code in the dying kernel because you need lots
of the kernel infrastructure to work, and it is too hard to code a small
little bit of stand-alone assembly, I am sorry for you.  Experience shows
that will never work when the kernel fails in interesting ways.

So no.  I don't think there is any point to putting any more effort into
the crash_kexec_post_notifiers path because experience has shown over
the years that in practice it won't work for anyone, and if the code
doesn't work in practice there is no point in developing or implementing
it.

Eric


