On Thu, 8 Oct 2020 at 12:45, Mark Rutland <mark.rutland@xxxxxxx> wrote:
> On Thu, Oct 08, 2020 at 11:40:52AM +0200, Marco Elver wrote:
> > On Thu, 1 Oct 2020 at 19:58, Mark Rutland <mark.rutland@xxxxxxx> wrote:
> > [...]
> > > > > If you need virt_to_page() to work, the address has to be part of the
> > > > > linear/direct map.
> > [...]
> > >
> > > What's the underlying requirement here? Is this a performance concern,
> > > codegen/codesize, or something else?
> >
> > It used to be performance, since is_kfence_address() is used in the
> > fast path. However, with some further tweaks we just did to
> > is_kfence_address(), our benchmarks show a pointer load can be
> > tolerated.
>
> Great!
>
> I reckon that this is something we can optimize in future if necessary
> (e.g. with some form of code-patching for immediate values), but it's
> good to have a starting point that works everywhere!
>
> [...]
>
> > > I'm not too worried about allocating this dynamically, but:
> > >
> > > * The arch code needs to set up the translation tables for this, as we
> > >   cannot safely change the mapping granularity live.
> > >
> > > * As above I'm fairly certain x86 needs to use a carveout from the
> > >   linear map to function correctly anyhow, so we should follow the same
> > >   approach for both arm64 and x86. That might be a static carveout that
> > >   we figure out the aliasing for, or something entirely dynamic.
> >
> > We're going with dynamically allocating the pool (for both x86 and
> > arm64), since any benefits we used to measure from the static pool are
> > no longer measurable (after removing a branch from
> > is_kfence_address()). It should hopefully simplify a lot of things,
> > given all the caveats that you pointed out.
> >
> > For arm64, the only thing left then is to fix up the case if the
> > linear map is not forced to page granularity.
>
> The simplest way to do this is to modify arm64's arch_add_memory() to
> force the entire linear map to be mapped at page granularity when KFENCE
> is enabled, something like:
>
> [...]
>
> ... and given that RODATA_FULL_DEFAULT_ENABLED is the default, I
> suspect it's not worth trying to only do that for the KFENCE region
> unless someone complains.

We've got most of this sorted now for v5 -- thank you!

The only thing we're wondering now is whether there are any corner cases
with using memblock_alloc'd memory for the KFENCE pool? (We'd like to
avoid page alloc's MAX_ORDER limit.) We have a version that passes
tests on x86 and arm64, but checking just in case. :-)

Thanks,
-- Marco
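
P.S. To make the is_kfence_address() point above concrete, here is a rough
userspace sketch of a range check against a dynamically allocated pool. It
is not the actual patch: the names pool, POOL_SIZE, pool_init() and
is_pool_address() are made up for illustration, malloc() merely stands in
for whatever allocator the real pool uses, and the single unsigned
comparison is just one common way to write such a check, assumed here
rather than taken from the series.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical pool size for this sketch: 64 pages of 4 KiB each. */
#define POOL_SIZE ((size_t)64 * 4096)

/* Base of the dynamically allocated pool; NULL until pool_init() succeeds. */
static char *pool;

/* Allocate the pool once at startup. Returns false if allocation fails. */
static bool pool_init(void)
{
	pool = malloc(POOL_SIZE);
	return pool != NULL;
}

/*
 * Return true if addr lies in [pool, pool + POOL_SIZE).
 *
 * The subtraction is done on unsigned integers, so an address below the
 * pool wraps around to a very large value and fails the same
 * "< POOL_SIZE" test that rejects addresses past the end. The cost on
 * the fast path is a load of the pool pointer plus one comparison.
 */
static inline bool is_pool_address(const void *addr)
{
	return pool != NULL &&
	       (uintptr_t)addr - (uintptr_t)pool < POOL_SIZE;
}
```

After one pool_init() call, is_pool_address(p) can be used on a hot path;
with a static pool the base would be a link-time constant instead of a
loaded pointer, which is the trade-off discussed above.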