On Fri, Oct 02, 2020 at 04:22:59PM +0200, Dmitry Vyukov wrote:
> On Fri, Oct 2, 2020 at 9:54 AM Jann Horn <jannh@xxxxxxxxxx> wrote:
> >
> > On Fri, Oct 2, 2020 at 8:33 AM Jann Horn <jannh@xxxxxxxxxx> wrote:
> > > On Tue, Sep 29, 2020 at 3:38 PM Marco Elver <elver@xxxxxxxxxx> wrote:
> > > > This adds the Kernel Electric-Fence (KFENCE) infrastructure. KFENCE is a
> > > > low-overhead sampling-based memory safety error detector of heap
> > > > use-after-free, invalid-free, and out-of-bounds access errors.
> > > >
> > > > KFENCE is designed to be enabled in production kernels, and has near
> > > > zero performance overhead. Compared to KASAN, KFENCE trades performance
> > > > for precision. The main motivation behind KFENCE's design, is that with
> > > > enough total uptime KFENCE will detect bugs in code paths not typically
> > > > exercised by non-production test workloads. One way to quickly achieve a
> > > > large enough total uptime is when the tool is deployed across a large
> > > > fleet of machines.
> > [...]
> > > > +/*
> > > > + * The pool of pages used for guard pages and objects. If supported, allocated
> > > > + * statically, so that is_kfence_address() avoids a pointer load, and simply
> > > > + * compares against a constant address. Assume that if KFENCE is compiled into
> > > > + * the kernel, it is usually enabled, and the space is to be allocated one way
> > > > + * or another.
> > > > + */
>
> KFENCE needs the range to be covered by struct page's and that's what
> creates problems for arm64. But I would assume most other users don't
> need that.

I've said this in a few other sub-threads, but the issue being attributed
to arm64 is a red herring, and it indicates a more fundamental issue that
also applies to x86, which will introduce a regression for existing
correctly-written code.
I don't think that's acceptable for a feature expected to be deployed in
production kernels, especially given that the failures are going to be
non-deterministic and hard to debug. The code in question is mostly going
to be in drivers, and it's very likely you may not hit it in local
testing.

If it is critical to avoid a pointer load here, then we need to either:

* Build some infrastructure for patching constants. The x86 static_call
  work is vaguely the right shape for this. Then we can place the KFENCE
  region anywhere (e.g. within the linear/direct map), and potentially
  dynamically allocate it.

* Go audit usage of {page,phys}_to_virt() to find any va->{page,pa}->va
  round-trips, and go modify that code to do something else which avoids
  a round-trip. When I last looked at this it didn't seem viable in
  general, since in many cases the physical address was the only piece of
  information which was retained.

I'd be really curious to see how using an immediate compares to loading
an __ro_after_init pointer value.

Thanks,
Mark.