On Fri, 1 Nov 2024 at 01:00, Kumar Kartikeya Dwivedi <memxor@xxxxxxxxx> wrote:
>
> More context is available in [0], but the TLDR; is that the verifier
> incorrectly assumes that any raw tracepoint argument will always be
> non-NULL. This means that even when users correctly check possible NULL
> arguments, the verifier can remove the NULL check due to incorrect
> knowledge of the NULL-ness of the pointer. Secondly, kernel helpers or
> kfuncs taking these trusted tracepoint arguments incorrectly assume that
> all arguments will always be valid non-NULL.
>
> In this set, we mark raw_tp arguments as PTR_MAYBE_NULL on top of
> PTR_TRUSTED, but special case their behavior when dereferencing them or
> pointer arithmetic over them is involved. When passing trusted args to
> helpers or kfuncs, raw_tp programs are permitted to pass possibly NULL
> pointers in such cases.
>
> Any loads into such maybe NULL trusted PTR_TO_BTF_ID is promoted to a
> PROBE_MEM load to handle emanating page faults. The verifier will ensure
> NULL checks on such pointers are preserved and do not lead to dead code
> elimination.
>
> This new behavior is not applied when ref_obj_id is non-zero, as those
> pointers do not belong to raw_tp arguments, but instead acquired
> objects.
>
> Since helpers and kfuncs already require attention for PTR_TO_BTF_ID
> (non-trusted) pointers, we do not implement any protection for such
> cases in this patch set, and leave it as future work for an upcoming
> series.
>
> A selftest is included with this patch set to verify the new behavior,
> and it crashes the kernel without the first patch.

I see that all selftests except one passed. The one that didn't appears
to have been cancelled after running for an hour, and stalled after
select_reuseport:OK.
Looking at the LLVM 18 run
(https://github.com/kernel-patches/bpf/actions/runs/11621768944/job/32366412581?pr=7999)
instead of the LLVM 17 run
(https://github.com/kernel-patches/bpf/actions/runs/11621768944/job/32366400714?pr=7999,
which failed), it seems the next test is send_signal_tracepoint. Is this
known to be flaky? I'm guessing not, and that the stall is probably
caused by my patch, but I just want to confirm before I begin debugging.

>
> [...]
>