On Tue, 13 Feb 2024 at 01:21, Yan Zhai <yan@xxxxxxxxxxxxxx> wrote:
>
> On Mon, Feb 12, 2024 at 5:52 PM Alexei Starovoitov
> <alexei.starovoitov@xxxxxxxxx> wrote:
> >
> > On Mon, Feb 12, 2024 at 3:42 PM Kumar Kartikeya Dwivedi
> > <memxor@xxxxxxxxx> wrote:
> > >
> > > On Tue, 13 Feb 2024 at 00:34, Alexei Starovoitov
> > > <alexei.starovoitov@xxxxxxxxx> wrote:
> > > >
> > > > On Mon, Feb 12, 2024 at 3:16 PM Ignat Korchagin <ignat@xxxxxxxxxxxxxx> wrote:
> > > > >
> > > > > [288931.217143][T109754] CPU: 4 PID: 109754 Comm: bpftrace Not tainted
> > > > > 6.6.16+ #10
> > > >
> > > > ...
> > > > > [288931.217143][T109754] ? copy_from_kernel_nofault+0x1d/0xe0
> > > > > [288931.217143][T109754] bpf_probe_read_compat+0x6a/0x90
> > > > >
> > > > > And Jakub CCed here did it for 6.8.0-rc2+
> > > >
> > > > I suspect something is broken in your kernels.
> > > > Above is doing generic copy_from_kernel_nofault(),
> > > > so one should be able to crash the kernel without any bpf.
> > > >
> > > > We have this in selftests/bpf:
> > > > __weak noinline struct file *bpf_testmod_return_ptr(int arg)
> > > > {
> > > >         static struct file f = {};
> > > >
> > > >         switch (arg) {
> > > >         case 1: return (void *)EINVAL; /* user addr */
> > > >         case 2: return (void *)0xcafe4a11; /* user addr */
> > > >         case 3: return (void *)-EINVAL; /* canonical, but invalid */
> > > >         case 4: return (void *)(1ull << 60); /* non-canonical and invalid */
> > > >         case 5: return (void *)~(1ull << 30); /* trigger extable */
> > > >         case 6: return &f; /* valid addr */
> > > >         case 7: return (void *)((long)&f | 1); /* kernel tricks */
> > > >         default: return NULL;
> > > >         }
> > > > }
> > > > where we check that extables setup by JIT for bpf progs are working correctly.
> > > > You should see the kernel crashing when you just run bpf selftests.
> > >
> > > I agree, this appears unrelated to BPF since it is happening when
> > > using copy_from_kernel_nofault (which should be jumping to the Efault
> > > label instead of the oops), but I think it's not specific to some
> > > custom kernel. I can reproduce it on my dev machine on top of bpf-next
> > > as well, and another machine with Ubuntu's generic 6.5 kernel for
> > > 24.04. And I think Ignat tried it on the mainline 6.8-rc2 as well.
> >
> copy_from_kernel_nofault is called in Jakub's reproducer, but the
> panic case in our production seems to be direct memory accessing
> according to bpftool dumped jited code. Will faults from such
> instructions also be caught correctly?
>

Yep, since faults in both cases end up in the page fault handler. Once
the fix pointed out by Alexei is applied, it should address both
scenarios.

> Yan
>
> > Then it must be vsyscall address that this series are fixing:
> > https://patchwork.kernel.org/project/netdevbpf/patch/20240202103935.3154011-3-houtao@xxxxxxxxxxxxxxx/
> >
> > We're still waiting on x86 maintainers to ack them.
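
P.S. To make the "both cases" point a bit more concrete, here is a rough
userspace model of the exception-table lookup that both paths rely on. It
is only a sketch, not kernel code: struct ex_entry, demo_table and
resume_ip_after_fault() are made-up names, the addresses are arbitrary,
and the real table layout and lookup differ. It also does not model the
vsyscall address handling that the series above touches. It only shows
that the helper's access and a JITed load are recovered by the same
mechanism once the fault reaches the handler.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified exception-table entry (not the kernel's layout). */
struct ex_entry {
	uintptr_t insn;		/* address of an instruction that may fault */
	uintptr_t fixup;	/* where execution resumes if it does fault */
};

/* Arbitrary illustrative addresses, assumed for this sketch only. */
static const struct ex_entry demo_table[] = {
	{ 0x1000, 0x1100 },	/* access inside a nofault copy helper */
	{ 0x2000, 0x2040 },	/* direct load emitted by the BPF JIT */
};

/*
 * Model of the fault handler's decision for a kernel-mode fault:
 * return the address to resume at, or 0 to stand for "no fixup, oops".
 * A linear search is used here purely for clarity.
 */
static uintptr_t resume_ip_after_fault(const struct ex_entry *tbl, size_t n,
					uintptr_t fault_ip)
{
	for (size_t i = 0; i < n; i++) {
		if (tbl[i].insn == fault_ip)
			return tbl[i].fixup;
	}
	return 0;
}
```

In this model a fault at 0x1000 or at 0x2000 takes the same lookup path
and gets redirected to its fixup, while a fault at an address with no
entry returns 0, which stands in for the oops.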