Re: [NEEDS-REVIEW] [PATCH] do_exit(): panic() when double fault detected

Jann Horn <jannh@xxxxxxxxxx> · Sun, 6 Dec 2020 23:05:14 +0100

On Sun, Dec 6, 2020 at 4:37 PM Dave Hansen <dave.hansen@xxxxxxxxx> wrote:
> On 12/6/20 5:10 AM, Vladimir Kondratiev wrote:
> > A double fault detected in do_exit() is a symptom of compromised
> > integrity. For safety-critical systems, it may be better to
> > panic() in this case to minimize risk.
>
> Does this fix a real problem that you have observed in practice?
>
> Or, is this a general "hardening" which you think is a good practice?
>
> What does this have to do specifically with safety critical systems?
>
> The kernel generally tries to fix things up and keep running whenever
> possible, if for no other reason than it helps debug problems.  If that
> is an undesirable property for your systems, then I think you have a
> much bigger problem than faults during exit().
>
> This option, "panic_on_double_fault", doesn't actually panic on all
> double-faults, which means to me that it's dangerously named.

I wonder whether part of the idea here is that normally, when the
kernel fixes up a kernel crash by killing the offending task, a
service management process in userspace (e.g. the init daemon) can
potentially detect this case because it looks as if the task died with
SIGBUS or something. (I don't think that actually always works in
practice though, since AFAICS kernel crashes only lead to the *task*
being killed, not the entire process, and I think killing a single
worker thread of a multithreaded process might just cause the rest of
the userspace process to lock up. Not sure whether that's intentional
or something that should ideally be changed.)
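
To make the detection idea concrete, here is a rough userspace sketch of
what a supervisor could do. The helper name reap_child_killed_by() is
made up for illustration, and the child merely simulates the crash by
raising a signal on itself; which signal a real kernel oops reports is
not something this sketch tries to model.

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: fork a child that dies from `sig`, then reap it
 * the way a service manager would.  Returns the terminating signal
 * number, 0 if the child exited normally, or -1 on error.
 */
int reap_child_killed_by(int sig)
{
	int status;
	pid_t pid = fork();

	if (pid < 0)
		return -1;
	if (pid == 0) {
		/* Child: restore the default action, then simulate the crash. */
		signal(sig, SIG_DFL);
		raise(sig);
		_exit(0);	/* only reached if the signal did not kill us */
	}

	/* Parent: this only returns once the child has become waitable. */
	if (waitpid(pid, &status, 0) != pid)
		return -1;
	if (WIFSIGNALED(status))
		return WTERMSIG(status);
	return 0;
}
```

Calling reap_child_killed_by(SIGBUS) should report SIGBUS back to the
parent. Note that this only covers the whole-process case; it says
nothing about a single thread of a multithreaded process being killed.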

But if the kernel gives up on going through with do_exit() (because it
crashed in do_exit() before being able to mark the task as waitable),
the process may, to userspace, appear to still be alive even though
it's not actually doing anything anymore; and if the kernel doesn't
tell userspace that the process is no longer functional, userspace
can't restore the system to a working state.

But as Dave said, this best-effort fixup is probably not the kind of
behavior you'd want in a "safety critical" system anyway; for example,
often the offending thread will have held some critical spinlock or
mutex or something, and then the rest of the system piles on into a
gigantic deadlock involving the lock in question and possibly multiple
locks that nest around it. You might be better off enabling
panic_on_oops, ideally with something like pstore-based logging of the
crash, and then quickly bringing everything back to a clean state
instead of continuing from an unstable state and then possibly
blocking somewhere.
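
As a minimal sketch of that setup, assuming a sysctl.d-style
configuration (the file name and the 10-second timeout are arbitrary
choices for illustration):

```
# /etc/sysctl.d/90-panic.conf  (hypothetical file name)

# Turn any kernel oops into a panic instead of killing only the task.
kernel.panic_on_oops = 1

# Reboot automatically 10 seconds after a panic.
kernel.panic = 10
```

pstore additionally needs a backend (for example ramoops or efi-pstore)
to be configured, which is not shown here; records saved from the
previous boot then show up under /sys/fs/pstore once it is mounted.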