On Thu, Oct 08, 2020 at 11:40:52AM +0200, Marco Elver wrote:
> On Thu, 1 Oct 2020 at 19:58, Mark Rutland <mark.rutland@xxxxxxx> wrote:
> [...]
> > > > If you need virt_to_page() to work, the address has to be part of
> > > > the linear/direct map.
> [...]
>
> > What's the underlying requirement here? Is this a performance concern,
> > codegen/codesize, or something else?
>
> It used to be performance, since is_kfence_address() is used in the
> fast path. However, with some further tweaks we just did to
> is_kfence_address(), our benchmarks show a pointer load can be
> tolerated.

Great! I reckon that this is something we can optimize in future if
necessary (e.g. with some form of code-patching for immediate values),
but it's good to have a starting point that works everywhere!

[...]

> > I'm not too worried about allocating this dynamically, but:
> >
> > * The arch code needs to set up the translation tables for this, as
> >   we cannot safely change the mapping granularity live.
> >
> > * As above I'm fairly certain x86 needs to use a carveout from the
> >   linear map to function correctly anyhow, so we should follow the
> >   same approach for both arm64 and x86. That might be a static
> >   carveout that we figure out the aliasing for, or something entirely
> >   dynamic.
>
> We're going with dynamically allocating the pool (for both x86 and
> arm64), since any benefits we used to measure from the static pool are
> no longer measurable (after removing a branch from
> is_kfence_address()). It should hopefully simplify a lot of things,
> given all the caveats that you pointed out.
>
> For arm64, the only thing left then is to fix up the case where the
> linear map is not forced to page granularity.

The simplest way to do this is to modify arm64's arch_add_memory() to
force the entire linear map to be mapped at page granularity when KFENCE
is enabled, something like:

| diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
| index 936c4762dadff..f6eba0642a4a3 100644
| --- a/arch/arm64/mm/mmu.c
| +++ b/arch/arm64/mm/mmu.c
| @@ -1454,7 +1454,8 @@ int arch_add_memory(int nid, u64 start, u64 size,
|  {
|  	int ret, flags = 0;
|
| -	if (rodata_full || debug_pagealloc_enabled())
| +	if (rodata_full || debug_pagealloc_enabled() ||
| +	    IS_ENABLED(CONFIG_KFENCE))
|  		flags = NO_BLOCK_MAPPINGS | NO_CONT_MAPPINGS;
|
|  	__create_pgd_mapping(swapper_pg_dir, start, __phys_to_virt(start),

... and given that RODATA_FULL_DEFAULT_ENABLED is the default, I suspect
it's not worth trying to only do that for the KFENCE region unless
someone complains.

Thanks,
Mark.
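
P.S. For anyone following along: with the dynamically allocated pool,
the fast-path check Marco describes comes down to one pointer load plus
a single unsigned comparison, roughly along the lines of the sketch
below. Treat the names as illustrative of the KFENCE series rather than
authoritative: __kfence_pool is the pool base set up at init time, and
KFENCE_POOL_SIZE is the compile-time pool size.

| /* Base of the KFENCE pool, set once the pool has been allocated. */
| extern char *__kfence_pool;
|
| static __always_inline bool is_kfence_address(const void *addr)
| {
| 	/*
| 	 * Subtracting the pool base and doing one unsigned comparison
| 	 * covers both the lower and the upper bound, so the fast path
| 	 * is just a load of __kfence_pool plus a compare-and-branch.
| 	 */
| 	return unlikely((unsigned long)((char *)addr - __kfence_pool) <
| 			KFENCE_POOL_SIZE);
| }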
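
To expand on the code-patching aside above: if that pointer load ever
does show up in profiles, the closest off-the-shelf mechanism today is
a static key flipped on once the pool is allocated, so kernels that
never initialise KFENCE see a patched-out branch instead. A hypothetical
sketch (kfence_pool_key is a made-up name, and a static branch is a
stand-in for true immediate-value patching):

| #include <linux/jump_label.h>
|
| /* Hypothetical key, enabled once the pool has been allocated. */
| DEFINE_STATIC_KEY_FALSE(kfence_pool_key);
|
| static __always_inline bool is_kfence_address(const void *addr)
| {
| 	/* Compiles to a NOP until static_branch_enable() patches it in. */
| 	if (!static_branch_unlikely(&kfence_pool_key))
| 		return false;
|
| 	return (unsigned long)((char *)addr - __kfence_pool) <
| 	       KFENCE_POOL_SIZE;
| }

... where the key would be enabled via
static_branch_enable(&kfence_pool_key) after a successful pool
allocation.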