Re: [PATCH v2 0/2] Introduce the pkill_on_warn parameter

Alexander Popov <alex.popov@xxxxxxxxx> · Tue, 16 Nov 2021 12:12:16 +0300

On 16.11.2021 01:06, Kees Cook wrote:
> Hmm, yes. What it originally boiled down to, which is why Linus first
> objected to BUG(), was that we don't know what other parts of the system
> have been disrupted. The best example is just that of locking: if we
> BUG() or do_exit() in the middle of holding a lock, we'll wreck whatever
> subsystem that was attached to. Without a deterministic system state
> unwinder, there really isn't a "safe" way to just stop a kernel thread.
>
> With this pkill_on_warn, we avoid the BUG problem (since the thread of
> execution continues and stops at an 'expected' place: the signal
> handler).
>
> However, now we have the newer objection from Linus, which is one of
> attribution: the WARN might be hit during an "unrelated" thread of
> execution and "current" gets blamed, etc. And beyond that, if we take
> down a portion of userspace, what in userspace may be destabilized? In
> theory, we get a case where any required daemons would be restarted by
> init, but that's not "known".
>
> The safest version of this I can think of is for processes to opt into
> this mitigation. That would also cover the "special cases" we've seen
> exposed too. i.e. init and kthreads would not opt in.
>
> However, that's a lot to implement when Marco's tracing suggestion might
> be sufficient and policy could be entirely implemented in userspace. It
> could be as simple as this (totally untested):

I don't think that this userspace warning handling can work as pkill_on_warn.

1. Kernel code execution continues after WARN_ON(); it will not wait for some
userspace daemon that is polling trace events. That is no different from
ignoring the warning and suffering all the negative effects that follow WARN_ON().

2. This userspace policy will miss WARN_ON_ONCE(), WARN_ONCE() and 
WARN_TAINT_ONCE() after the first hit.
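
To make both points concrete, here is a minimal userspace model (not kernel
code, and not the real macro implementation). All names in it
(model_warn_on_once, model_report, trace_events, run_buggy_path) are
hypothetical and exist only for this sketch; trace_events stands in for
whatever a tracing daemon would observe.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical: number of reports a userspace trace listener would see. */
int trace_events;

/* Hypothetical stand-in for emitting the warning report / trace event. */
static void model_report(void)
{
	trace_events++;
}

/*
 * Simplified model of the *_ONCE() behaviour: the condition is always
 * evaluated and returned, but the report is emitted only on the first
 * hit for a given call site (tracked by the caller-provided flag).
 */
static bool model_warn_on_once(bool *warned, bool cond)
{
	if (cond && !*warned) {
		*warned = true;
		model_report();
	}
	return cond;
}

/* Hits the same warning site repeatedly; returns how often the code
 * after the warning site ran. */
int run_buggy_path(int iterations)
{
	static bool warned;	/* one flag per call site */
	int continued = 0;

	assert(iterations >= 0);
	for (int i = 0; i < iterations; i++) {
		model_warn_on_once(&warned, true);
		continued++;	/* execution carries on regardless */
	}
	return continued;
}
```

In this model, run_buggy_path(3) runs the code after the warning site three
times, while trace_events ends up at 1: the listener sees only the first hit
and never gets a chance to stop anything.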

Oh, wait...
I've got a crazy idea that may bring more consistency to the error handling mess.

What if the Linux kernel had an LSM module responsible for error handling policy?
That would require adding LSM hooks to BUG*(), WARN*(), KERN_EMERG, etc.
In such an LSM policy we could decide immediately how to react to a kernel error.
We could even decide depending on the subsystem and things like that.
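
A rough userspace sketch of what such a policy decision could look like. No
such LSM hook exists in the kernel as far as this thread is concerned; every
name here (kerr_kind, kerr_action, kerr_event, kerr_policy_decide) is
hypothetical, and the rules themselves are an arbitrary example policy, not a
proposal.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical: where the error was raised. */
enum kerr_kind { KERR_WARN, KERR_BUG, KERR_EMERG };

/* Hypothetical: what the policy tells the kernel to do. */
enum kerr_action { KERR_CONTINUE, KERR_KILL_CURRENT, KERR_PANIC };

/* Hypothetical: context a hook might pass to the policy. */
struct kerr_event {
	enum kerr_kind kind;
	const char *subsystem;		/* e.g. "net", "fs" */
	int is_kthread_or_init;		/* non-zero for kthreads and init */
};

/*
 * Example policy only: the decision is made synchronously at the error
 * site, so it does not depend on a userspace daemon reacting in time.
 */
enum kerr_action kerr_policy_decide(const struct kerr_event *ev)
{
	assert(ev != NULL && ev->subsystem != NULL);

	/* Treat BUG*() and KERN_EMERG as fatal in this example. */
	if (ev->kind == KERR_BUG || ev->kind == KERR_EMERG)
		return KERR_PANIC;

	/* Never kill init or a kernel thread on a warning. */
	if (ev->is_kthread_or_init)
		return KERR_CONTINUE;

	/* Per-subsystem rule: kill the current process for "net" warnings. */
	if (strcmp(ev->subsystem, "net") == 0)
		return KERR_KILL_CURRENT;

	return KERR_CONTINUE;
}
```

The point of the sketch is only that kind, subsystem and task type are all
available at decision time, so one place could hold the whole policy.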

(idea for brainstorming)

Best regards,
Alexander