> Do you have a plan to backport this into upstream LTS kernels?

As I understand, the answer is "hopefully yes" with the big
presumption that all stakeholders are on board for the change. There
is *definitely* a plan to *submit* backports to the stable trees, but
ofc it will require some approvals.

On Thu, Jan 19, 2023 at 3:10 PM SeongJae Park <sj@xxxxxxxxxx> wrote:
>
> Hello,
>
> On Thu, 17 Nov 2022 15:43:22 -0800 Kees Cook <keescook@xxxxxxxxxxxx> wrote:
>
> > From: Jann Horn <jannh@xxxxxxxxxx>
> >
> > Many Linux systems are configured to not panic on oops; but allowing an
> > attacker to oops the system **really** often can make even bugs that look
> > completely unexploitable exploitable (like NULL dereferences and such) if
> > each crash elevates a refcount by one or a lock is taken in read mode, and
> > this causes a counter to eventually overflow.
> >
> > The most interesting counters for this are 32 bits wide (like open-coded
> > refcounts that don't use refcount_t). (The ldsem reader count on 32-bit
> > platforms is just 16 bits, but probably nobody cares about 32-bit platforms
> > that much nowadays.)
> >
> > So let's panic the system if the kernel is constantly oopsing.
> >
> > The speed of oopsing 2^32 times probably depends on several factors, like
> > how long the stack trace is and which unwinder you're using; an empirically
> > important one is whether your console is showing a graphical environment or
> > a text console that oopses will be printed to.
> > In a quick single-threaded benchmark, it looks like oopsing in a vfork()
> > child with a very short stack trace only takes ~510 microseconds per run
> > when a graphical console is active; but switching to a text console that
> > oopses are printed to slows it down around 87x, to ~45 milliseconds per
> > run.
> > (Adding more threads makes this faster, but the actual oops printing
> > happens under &die_lock on x86, so you can maybe speed this up by a factor
> > of around 2 and then any further improvement gets eaten up by lock
> > contention.)
> >
> > It looks like it would take around 8-12 days to overflow a 32-bit counter
> > with repeated oopsing on a multi-core X86 system running a graphical
> > environment; both me (in an X86 VM) and Seth (with a distro kernel on
> > normal hardware in a standard configuration) got numbers in that ballpark.
> >
> > 12 days aren't *that* short on a desktop system, and you'd likely need much
> > longer on a typical server system (assuming that people don't run graphical
> > desktop environments on their servers), and this is a *very* noisy and
> > violent approach to exploiting the kernel; and it also seems to take orders
> > of magnitude longer on some machines, probably because stuff like EFI
> > pstore will slow it down a ton if that's active.
>
> I found a blog article[1] recommending LTS kernels to backport this as below.
>
>     While this patch is already upstream, it is important that distributed
>     kernels also inherit this oops limit and backport it to LTS releases if we
>     want to avoid treating such null-dereference bugs as full-fledged security
>     issues in the future.
>
> Do you have a plan to backport this into upstream LTS kernels?
>
> [1] https://googleprojectzero.blogspot.com/2023/01/exploiting-null-dereferences-in-linux.html
>
>
> Thanks,
> SJ
>
> >
> > Signed-off-by: Jann Horn <jannh@xxxxxxxxxx>
> > Link: https://lore.kernel.org/r/20221107201317.324457-1-jannh@xxxxxxxxxx
> > Reviewed-by: Luis Chamberlain <mcgrof@xxxxxxxxxx>
> > Signed-off-by: Kees Cook <keescook@xxxxxxxxxxxx>
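
For reference, the timing estimates in the quoted commit message can be sanity-checked with a few lines of arithmetic. This is only a rough sketch: the per-oops costs (~510 microseconds and ~45 milliseconds) and the "factor of around 2" parallel speedup are taken from the commit message, while the function and variable names are made up for illustration and the model assumes a constant cost per oops.

```python
# Back-of-the-envelope check of the timings in the quoted commit message.
# Assumption: every oops costs the same, and each oops advances the
# counter by exactly one step.

SECONDS_PER_DAY = 24 * 60 * 60
OOPSES_TO_WRAP = 2**32  # steps needed to wrap a 32-bit counter


def days_to_wrap(seconds_per_oops: float, speedup: float = 1.0) -> float:
    """Days needed to oops OOPSES_TO_WRAP times at the given per-oops cost."""
    return OOPSES_TO_WRAP * seconds_per_oops / speedup / SECONDS_PER_DAY


# ~510 us per oops with a graphical console active, single-threaded.
graphical_single = days_to_wrap(510e-6)

# Same cost, with the roughly 2x speedup that die_lock contention allows.
graphical_multi = days_to_wrap(510e-6, speedup=2.0)

# ~45 ms per oops when oopses are printed to a text console.
text_console_single = days_to_wrap(45e-3)

print(f"graphical console, 1 thread:   {graphical_single:.1f} days")
print(f"graphical console, ~2x faster: {graphical_multi:.1f} days")
print(f"text console, 1 thread:        {text_console_single:.0f} days")
```

Under these assumptions the single-threaded graphical case comes out at roughly 25 days, and the 2x case at a little under 13 days, which is close to the upper end of the 8-12 day ballpark quoted above. The text-console case works out to several years.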