On Wed, 7 Oct 2020 at 16:15, Jann Horn <jannh@xxxxxxxxxx> wrote:
>
> On Wed, Oct 7, 2020 at 3:09 PM Marco Elver <elver@xxxxxxxxxx> wrote:
> > On Fri, 2 Oct 2020 at 07:45, Jann Horn <jannh@xxxxxxxxxx> wrote:
> > > On Tue, Sep 29, 2020 at 3:38 PM Marco Elver <elver@xxxxxxxxxx> wrote:
> > > > Add architecture specific implementation details for KFENCE and enable
> > > > KFENCE for the x86 architecture. In particular, this implements the
> > > > required interface in <asm/kfence.h> for setting up the pool and
> > > > providing helper functions for protecting and unprotecting pages.
> > > >
> > > > For x86, we need to ensure that the pool uses 4K pages, which is done
> > > > using the set_memory_4k() helper function.
> > > [...]
> > > > diff --git a/arch/x86/include/asm/kfence.h b/arch/x86/include/asm/kfence.h
> > > [...]
> > > > +/* Protect the given page and flush TLBs. */
> > > > +static inline bool kfence_protect_page(unsigned long addr, bool protect)
> > > > +{
> > > > +	unsigned int level;
> > > > +	pte_t *pte = lookup_address(addr, &level);
> > > > +
> > > > +	if (!pte || level != PG_LEVEL_4K)
> > >
> > > Do we actually expect this to happen, or is this just a "robustness"
> > > check? If we don't expect this to happen, there should be a WARN_ON()
> > > around the condition.
> >
> > It's not obvious here, but we already have this covered with a WARN:
> > the core.c code has a KFENCE_WARN_ON, which disables KFENCE on a
> > warning.
>
> So for this specific branch: Can it ever happen? If not, please either
> remove it or add WARN_ON(). That serves two functions: It ensures that
> if something unexpected happens, we see a warning, and it hints to
> people reading the code "this isn't actually expected to happen, you
> don't have to wrack your brain trying to figure out for which scenario
> this branch is intended".

Perhaps I could have been clearer: we already have this returning
false covered by a WARN+disable KFENCE in core.c.
We'll add another WARN_ON right here, as it doesn't hurt, and
hopefully improves readability.

> > > > +		return false;
> > > > +
> > > > +	if (protect)
> > > > +		set_pte(pte, __pte(pte_val(*pte) & ~_PAGE_PRESENT));
> > > > +	else
> > > > +		set_pte(pte, __pte(pte_val(*pte) | _PAGE_PRESENT));
> > >
> > > Hmm... do we have this helper (instead of using the existing helpers
> > > for modifying memory permissions) to work around the allocation out of
> > > the data section?
> >
> > I just played around with using the set_memory.c functions, to remind
> > myself why this didn't work. I experimented with using
> > set_memory_{np,p}() functions; set_memory_p() isn't implemented, but
> > is easily added (which I did for below experiment). However, this
> > didn't quite work:
> [...]
> > For one, smp_call_function_many_cond() doesn't want to be called with
> > interrupts disabled, and we may very well get a KFENCE allocation or
> > page fault with interrupts disabled / within interrupts.
> >
> > Therefore, to be safe, we should avoid IPIs.
>
> set_direct_map_invalid_noflush() does that, too, I think? And that's
> already implemented for both arm64 and x86.

Sure, that works. We still want the flush_tlb_one_kernel(), at least
so the local CPU's TLB is flushed.

> > It follows that setting
> > the page attribute is best-effort, and we can tolerate some
> > inaccuracy. Lazy fault handling should take care of faults after we
> > set the page as PRESENT.
> [...]
> > > Shouldn't kfence_handle_page_fault() happen after prefetch handling,
> > > at least? Maybe directly above the "oops" label?
> >
> > Good question. AFAIK it doesn't matter, as is_kfence_address() should
> > never apply for any of those that follow, right? In any case, it
> > shouldn't hurt to move it down.
>
> is_prefetch() ignores any #PF not caused by instruction fetch if it
> comes from kernel mode and the faulting instruction is one of the
> PREFETCH* instructions.
> (Which is not supposed to happen - the
> processor should just be ignoring the fault for PREFETCH instead of
> generating an exception AFAIK. But the comments say that this is about
> CPU bugs and stuff.) While this is probably not a big deal anymore
> partly because the kernel doesn't use software prefetching in many
> places anymore, it seems to me like, in principle, this could also
> cause page faults that should be ignored in KFENCE regions if someone
> tries to do PREFETCH on an out-of-bounds array element or a dangling
> pointer or something.

Thanks for the clarification.
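
To make the points agreed in this thread concrete (warn on the branch that should never be taken, toggle the present bit, and flush only the local TLB so that no IPIs are needed), here is a rough user-space model. It is only a sketch and not the kernel patch: struct model_mapping, model_warn_on(), model_protect_page() and the MODEL_* constants are hypothetical stand-ins for pte_t, WARN_ON(), kfence_protect_page() and the x86 definitions, and the flush counter merely marks where flush_tlb_one_kernel() would be called.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-ins for the kernel definitions; not the real API. */
#define MODEL_PAGE_PRESENT 0x1ULL
enum model_level { MODEL_LEVEL_4K = 1, MODEL_LEVEL_2M = 2 };

struct model_mapping {
	uint64_t pte;           /* simulated page table entry */
	enum model_level level; /* simulated mapping size */
};

static int model_warn_count;    /* how many times the modelled WARN_ON fired */
static int model_local_flushes; /* how many local TLB flushes were issued */

/* Mimics WARN_ON(): report the problem and return the condition. */
static bool model_warn_on(bool cond, const char *what)
{
	if (cond) {
		model_warn_count++;
		fprintf(stderr, "WARNING: %s\n", what);
	}
	return cond;
}

/*
 * Model of the protect/unprotect helper: warn and bail out on anything
 * that is not a 4K mapping (never expected), toggle the present bit,
 * then flush only the local CPU's TLB (no cross-CPU IPIs).
 */
static bool model_protect_page(struct model_mapping *m, bool protect)
{
	if (model_warn_on(!m || m->level != MODEL_LEVEL_4K,
			  "unexpected missing or non-4K mapping"))
		return false;

	if (protect)
		m->pte &= ~MODEL_PAGE_PRESENT;
	else
		m->pte |= MODEL_PAGE_PRESENT;

	model_local_flushes++; /* stands in for flush_tlb_one_kernel(addr) */
	return true;
}
```

Because only the local flush is modelled, other CPUs may keep a stale translation for a while. That matches the best-effort behaviour described above, where lazy fault handling covers faults taken after the page is set PRESENT again.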