Re: Page faults in tracepoint caused by aliased pointer

On Mon, Feb 12, 2024 at 5:52 PM Alexei Starovoitov
<alexei.starovoitov@xxxxxxxxx> wrote:
>
> On Mon, Feb 12, 2024 at 3:42 PM Kumar Kartikeya Dwivedi
> <memxor@xxxxxxxxx> wrote:
> >
> > On Tue, 13 Feb 2024 at 00:34, Alexei Starovoitov
> > <alexei.starovoitov@xxxxxxxxx> wrote:
> > >
> > > On Mon, Feb 12, 2024 at 3:16 PM Ignat Korchagin <ignat@xxxxxxxxxxxxxx> wrote:
> > > >
> > > > [288931.217143][T109754] CPU: 4 PID: 109754 Comm: bpftrace Not tainted
> > > > 6.6.16+ #10
> > >
> > > ...
> > > > [288931.217143][T109754]  ? copy_from_kernel_nofault+0x1d/0xe0
> > > > [288931.217143][T109754]  bpf_probe_read_compat+0x6a/0x90
> > > >
> > > > And Jakub CCed here did it for 6.8.0-rc2+
> > >
> > > I suspect something is broken in your kernels.
> > > Above is doing generic copy_from_kernel_nofault(),
> > > so one should be able to crash the kernel without any bpf.
> > >
> > > We have this in selftests/bpf:
> > > __weak noinline struct file *bpf_testmod_return_ptr(int arg)
> > > {
> > >         static struct file f = {};
> > >
> > >         switch (arg) {
> > >         case 1: return (void *)EINVAL;          /* user addr */
> > >         case 2: return (void *)0xcafe4a11;      /* user addr */
> > >         case 3: return (void *)-EINVAL;         /* canonical, but invalid */
> > >         case 4: return (void *)(1ull << 60);    /* non-canonical and invalid */
> > >         case 5: return (void *)~(1ull << 30);   /* trigger extable */
> > >         case 6: return &f;                      /* valid addr */
> > >         case 7: return (void *)((long)&f | 1);  /* kernel tricks */
> > >         default: return NULL;
> > >         }
> > > }
> > > where we check that extables setup by JIT for bpf progs are working correctly.
> > > You should see the kernel crashing when you just run bpf selftests.
> >
> > I agree, this appears unrelated to BPF since it is happening when
> > using copy_from_kernel_nofault (which should be jumping to the Efault
> > label instead of the oops), but I think it's not specific to some
> > custom kernel. I can reproduce it on my dev machine on top of bpf-next
> > as well, and another machine with Ubuntu's generic 6.5 kernel for
> > 24.04. And I think Ignat tried it on the mainline 6.8-rc2 as well.
>
copy_from_kernel_nofault() is called in Jakub's reproducer, but the
panic in our production case seems to come from a direct memory
access, according to the JITed code dumped by bpftool. Will faults
from such instructions also be caught correctly?

Yan

> Then it must be the vsyscall address that this series is fixing:
> https://patchwork.kernel.org/project/netdevbpf/patch/20240202103935.3154011-3-houtao@xxxxxxxxxxxxxxx/
>
> We're still waiting on x86 maintainers to ack them.
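
For readers following the address classes named in the selftest comments
above (user, canonical, non-canonical) and the vsyscall address mentioned
at the end, here is a rough userspace sketch of how such addresses could be
told apart. It assumes x86-64 with 4-level paging (48-bit virtual
addresses) and 4 KiB pages. All names in it (classify, is_canonical, the
*_SKETCH constants, enum addr_class) are hypothetical and made up for
illustration. It is not the kernel's code and not what the referenced
series does. It only shows that the legacy vsyscall page sits in the upper
(kernel) half of the address space, so a plain "is this below the user
limit?" check does not single it out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumptions: x86-64, 4-level paging (48-bit VAs), 4 KiB pages. */
#define PAGE_SIZE_SKETCH     4096ull
#define VSYSCALL_ADDR_SKETCH 0xffffffffff600000ull /* legacy vsyscall page */

enum addr_class {
	ADDR_USER_HALF,     /* canonical, lower half */
	ADDR_NONCANONICAL,  /* bits 63..47 are not all equal */
	ADDR_VSYSCALL,      /* inside the legacy vsyscall page */
	ADDR_KERNEL_HALF,   /* canonical, upper half (may still be unmapped) */
};

/* Canonical means bits 63..47 are all copies of bit 47. */
static bool is_canonical(uint64_t addr)
{
	uint64_t top = addr >> 47; /* the 17 most significant bits */

	return top == 0 || top == 0x1ffff;
}

static enum addr_class classify(uint64_t addr)
{
	if (!is_canonical(addr))
		return ADDR_NONCANONICAL;
	if ((addr >> 47) == 0)
		return ADDR_USER_HALF;
	/* Round down to the page base and compare with the vsyscall page. */
	if ((addr & ~(PAGE_SIZE_SKETCH - 1)) == VSYSCALL_ADDR_SKETCH)
		return ADDR_VSYSCALL;
	return ADDR_KERNEL_HALF;
}
```

With these assumptions, classify(0xcafe4a11) lands in the user half,
classify(1ull << 60) is non-canonical, and (uint64_t)-EINVAL as well as
~(1ull << 30) land in the kernel half. The classification says nothing
about whether an address is actually mapped.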
