Christophe Leroy <christophe.leroy@xxxxxxxxxx> writes:

> Le 15/10/2024 à 03:33, Ritesh Harjani (IBM) a écrit :
>> copy_from_kernel_nofault() can be called when doing read of /proc/kcore.
>> /proc/kcore can have some unmapped kfence objects which when read via
>> copy_from_kernel_nofault() can cause page faults. Since *_nofault()
>> functions define their own fixup table for handling fault, use that
>> instead of asking kfence to handle such faults.
>>
>> Hence we search the exception tables for the nip which generated the
>> fault. If there is an entry then we let the fixup table handler handle the
>> page fault by returning an error from within ___do_page_fault().
>>
>> This can be easily triggered if someone tries to do dd from /proc/kcore.
>> dd if=/proc/kcore of=/dev/null bs=1M
>>
>> <some example false negatives>
>> ===============================
>> BUG: KFENCE: invalid read in copy_from_kernel_nofault+0xb0/0x1c8
>> Invalid read at 0x000000004f749d2e:
>>  copy_from_kernel_nofault+0xb0/0x1c8
>>  0xc0000000057f7950
>>  read_kcore_iter+0x41c/0x9ac
>>  proc_reg_read_iter+0xe4/0x16c
>>  vfs_read+0x2e4/0x3b0
>>  ksys_read+0x88/0x154
>>  system_call_exception+0x124/0x340
>>  system_call_common+0x160/0x2c4
>>
>> BUG: KFENCE: use-after-free read in copy_from_kernel_nofault+0xb0/0x1c8
>> Use-after-free read at 0x000000008fbb08ad (in kfence-#0):
>>  copy_from_kernel_nofault+0xb0/0x1c8
>>  0xc0000000057f7950
>>  read_kcore_iter+0x41c/0x9ac
>>  proc_reg_read_iter+0xe4/0x16c
>>  vfs_read+0x2e4/0x3b0
>>  ksys_read+0x88/0x154
>>  system_call_exception+0x124/0x340
>>  system_call_common+0x160/0x2c4
>>
>> Guessing the fix should go back to when we first got kfence on PPC32.
>>
>> Fixes: 90cbac0e995d ("powerpc: Enable KFENCE for PPC32")
>> Reported-by: Disha Goel <disgoel@xxxxxxxxxxxxx>
>> Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@xxxxxxxxx>
>> ---
>>  arch/powerpc/mm/fault.c | 10 +++++++++-
>>  1 file changed, 9 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
>> index 81c77ddce2e3..fa825198f29f 100644
>> --- a/arch/powerpc/mm/fault.c
>> +++ b/arch/powerpc/mm/fault.c
>> @@ -439,9 +439,17 @@ static int ___do_page_fault(struct pt_regs *regs, unsigned long address,
>>  	/*
>>  	 * The kernel should never take an execute fault nor should it
>>  	 * take a page fault to a kernel address or a page fault to a user
>> -	 * address outside of dedicated places
>> +	 * address outside of dedicated places.
>> +	 *
>> +	 * Rather than kfence reporting false negatives, let the fixup table
>> +	 * handler handle the page fault by returning SIGSEGV, if the fault
>> +	 * has come from functions like copy_from_kernel_nofault().
>>  	 */
>>  	if (unlikely(!is_user && bad_kernel_fault(regs, error_code, address, is_write))) {
>> +
>> +		if (search_exception_tables(instruction_pointer(regs)))
>> +			return SIGSEGV;
>
> This is a heavy operation. It should at least be done only when KFENCE
> is built-in.
>
> kfence_handle_page_fault() bails out immediately when
> is_kfence_address() returns false, and is_kfence_address() always
> returns false when KFENCE is not built-in.
>
> So you could check that before calling the heavyweight
> search_exception_tables():
>
> 	if (is_kfence_address(address) &&
> 	    !search_exception_tables(instruction_pointer(regs)) &&
> 	    kfence_handle_page_fault(address, is_write, regs))
> 		return 0;
>

Yes, thanks for the input. I agree with the above. I will take that in v3.
I will wait for some time for any review comments on the other patches
before spinning a v3, though.

>> +
>>  	if (kfence_handle_page_fault(address, is_write, regs))
>>  		return 0;

-ritesh