On Fri, 30 Oct 2020 at 03:49, Jann Horn <jannh@xxxxxxxxxx> wrote: > On Thu, Oct 29, 2020 at 2:17 PM Marco Elver <elver@xxxxxxxxxx> wrote: > > Add architecture specific implementation details for KFENCE and enable > > KFENCE for the x86 architecture. In particular, this implements the > > required interface in <asm/kfence.h> for setting up the pool and > > providing helper functions for protecting and unprotecting pages. > > > > For x86, we need to ensure that the pool uses 4K pages, which is done > > using the set_memory_4k() helper function. > > > > Reviewed-by: Dmitry Vyukov <dvyukov@xxxxxxxxxx> > > Co-developed-by: Marco Elver <elver@xxxxxxxxxx> > > Signed-off-by: Marco Elver <elver@xxxxxxxxxx> > > Signed-off-by: Alexander Potapenko <glider@xxxxxxxxxx> > [...] > > diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c > [...] > > @@ -725,6 +726,9 @@ no_context(struct pt_regs *regs, unsigned long error_code, > > if (IS_ENABLED(CONFIG_EFI)) > > efi_recover_from_page_fault(address); > > > > + if (kfence_handle_page_fault(address)) > > + return; > > We can also get to this point due to an attempt to execute a data > page. That's very unlikely (given that the same thing would also crash > if you tried to do it with normal heap memory, and KFENCE allocations > are extremely rare); but we might want to try to avoid handling such > faults as KFENCE faults, since KFENCE will assume that it has resolved > the fault and retry execution of the faulting instruction. Once kernel > protection keys are introduced, those might cause the same kind of > trouble. > > So we might want to gate this on a check like "if ((error_code & > X86_PF_PROT) == 0)" (meaning "only handle the fault if the fault was > caused by no page being present", see enum x86_pf_error_code). Good point. Will fix in v7. > Unrelated sidenote: Since we're hooking after exception fixup > handling, the debug-only KFENCE_STRESS_TEST_FAULTS can probably still > cause some behavioral differences through spurious faults in places > like copy_user_enhanced_fast_string (where the exception table entries > are used even if the *kernel* pointer, not the user pointer, causes a > fault). But since KFENCE_STRESS_TEST_FAULTS is exclusively for KFENCE > development, the difference might not matter. And ordering them the > other way around definitely isn't possible, because the kernel relies > on being able to fixup OOB reads. So there probably isn't really > anything we can do better here; it's just something to keep in mind. > Maybe you can add a little warning to the help text for that Kconfig > entry that warns people about this? Thanks for pointing it out, but that option really is *only* to stress kfence with concurrent allocations/frees/page faults. If anybody enables this option for anything other than testing kfence, it's their own fault. ;-) I'll try to add a generic note to the Kconfig entry, but what you mention here seems quite x86-specific. Thanks, -- Marco