On Fri, Dec 08 2023 at 15:11, Jann Horn wrote:
> On Tue, Nov 21, 2023 at 6:13 PM Thomas Gleixner <tglx@xxxxxxxxxxxxx> wrote:
>> > BUG: unable to handle page fault for address: ffffffffff600000
>>
>> This is VSYSCALL_ADDR.
>>
>> So the real question is why the BPF program tries to copy from the
>> VSYSCALL page, which is not mapped.
>
> The linked syz repro is:
>
> r0 = bpf$PROG_LOAD(0x5, &(0x7f00000000c0)={0x11, 0xb,
> &(0x7f0000000180)=@framed={{}, [@printk={@integer, {}, {}, {}, {},
> {0x7, 0x0, 0xb, 0x3, 0x0, 0x0, 0xff600000}, {0x85, 0x0, 0x0, 0x71}}]},
> &(0x7f0000000200)='GPL\x00', 0x0, 0x0, 0x0, 0x0, 0x0, '\x00', 0x0,
> 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0},
> 0x90)
> bpf$BPF_RAW_TRACEPOINT_OPEN(0x11,
> &(0x7f0000000540)={&(0x7f0000000000)='kfree\x00', r0}, 0x10)
>
> So syzkaller generated a BPF tracing program. 0x85 is BPF_JMP |
> BPF_CALL, which is used to invoke BPF helpers; 0x71 is 113, which is
> the number of the probe_read_kernel helper, which basically takes
> arbitrary values as input and casts them to kernel pointers, and then
> probe-reads them. And before that is some kinda ALU op with 0xff600000
> as immediate.
>
> So it looks like the answer to that question is "the BPF program tries
> to copy from the VSYSCALL page because syzkaller decided to write BPF
> code that does specifically that, and the BPF helper let it do that".

Indeed.

> copy_from_kernel_nofault() does check
> copy_from_kernel_nofault_allowed() to make sure the pointer really is
> a kernel pointer, and the X86 version of that rejects anything in the
> userspace part of the address space. But it does not know about the
> vsyscall area.

That's curable. Untested fix below.

Thanks for the explanation!

        tglx

---
diff --git a/arch/x86/mm/maccess.c b/arch/x86/mm/maccess.c
index 6993f026adec..8e846833aa37 100644
--- a/arch/x86/mm/maccess.c
+++ b/arch/x86/mm/maccess.c
@@ -3,6 +3,8 @@
 #include <linux/uaccess.h>
 #include <linux/kernel.h>
 
+#include <uapi/asm/vsyscall.h>
+
 #ifdef CONFIG_X86_64
 bool copy_from_kernel_nofault_allowed(const void *unsafe_src, size_t size)
 {
@@ -15,6 +17,9 @@ bool copy_from_kernel_nofault_allowed(const void *unsafe_src, size_t size)
 	if (vaddr < TASK_SIZE_MAX + PAGE_SIZE)
 		return false;
 
+	if ((vaddr & PAGE_MASK) == VSYSCALL_ADDR)
+		return false;
+
 	/*
 	 * Allow everything during early boot before 'x86_virt_bits'
 	 * is initialized. Needed for instruction decoding in early
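
For illustration, the syzkaller bytecode quoted above does roughly the
equivalent of the following BPF C sketch (hypothetical and untested; the
repro emits raw instructions rather than C, and the section and program
names below are just the usual libbpf conventions):

#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>

char LICENSE[] SEC("license") = "GPL";

/* Raw tracepoint on kfree, mirroring the BPF_RAW_TRACEPOINT_OPEN above. */
SEC("raw_tracepoint/kfree")
int read_vsyscall(void *ctx)
{
	__u64 buf;

	/*
	 * 0xffffffffff600000 is VSYSCALL_ADDR. bpf_probe_read_kernel()
	 * (helper 113) passes it to copy_from_kernel_nofault(), and
	 * without the check added above the unmapped vsyscall page is
	 * not rejected, so the read faults.
	 */
	bpf_probe_read_kernel(&buf, sizeof(buf), (void *)0xffffffffff600000);
	return 0;
}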