On Tue, 2020-09-15 at 15:20 +0200, Marco Elver wrote:
> This adds the Kernel Electric-Fence (KFENCE) infrastructure. KFENCE is a
> low-overhead sampling-based memory safety error detector of heap
> use-after-free, invalid-free, and out-of-bounds access errors. This
> series enables KFENCE for the x86 and arm64 architectures, and adds
> KFENCE hooks to the SLAB and SLUB allocators.
>
> KFENCE is designed to be enabled in production kernels, and has near
> zero performance overhead. Compared to KASAN, KFENCE trades performance
> for precision. The main motivation behind KFENCE's design, is that with
> enough total uptime KFENCE will detect bugs in code paths not typically
> exercised by non-production test workloads. One way to quickly achieve a
> large enough total uptime is when the tool is deployed across a large
> fleet of machines.
>
> KFENCE objects each reside on a dedicated page, at either the left or
> right page boundaries. The pages to the left and right of the object
> page are "guard pages", whose attributes are changed to a protected
> state, and cause page faults on any attempted access to them. Such page
> faults are then intercepted by KFENCE, which handles the fault
> gracefully by reporting a memory access error.
>
> Guarded allocations are set up based on a sample interval (can be set
> via kfence.sample_interval). After expiration of the sample interval,
> the next allocation through the main allocator (SLAB or SLUB) returns a
> guarded allocation from the KFENCE object pool. At this point, the timer
> is reset, and the next allocation is set up after the expiration of the
> interval.
>
> To enable/disable a KFENCE allocation through the main allocator's
> fast-path without overhead, KFENCE relies on static branches via the
> static keys infrastructure. The static branch is toggled to redirect the
> allocation to KFENCE.
>
> The KFENCE memory pool is of fixed size, and if the pool is exhausted no
> further KFENCE allocations occur.
> The default config is conservative
> with only 255 objects, resulting in a pool size of 2 MiB (with 4 KiB
> pages).
>
> We have verified by running synthetic benchmarks (sysbench I/O,
> hackbench) that a kernel with KFENCE is performance-neutral compared to
> a non-KFENCE baseline kernel.
>
> KFENCE is inspired by GWP-ASan [1], a userspace tool with similar
> properties. The name "KFENCE" is a homage to the Electric Fence Malloc
> Debugger [2].
>
> For more details, see Documentation/dev-tools/kfence.rst added in the
> series -- also viewable here:

Does anybody else grow tired of all these different *imperfect* versions
of in-kernel memory safety error detectors: KASAN-generic, KFENCE,
KASAN-tag-based, etc.? Then we have old things like page_poison, SLUB
debugging, debug_pagealloc, etc., which are pretty much inefficient at
detecting bugs these days compared to KASAN. Can't we work towards
having a single implementation and clean up this whole mess?