Re: Why is recursion protection needed in bpf syscalls?

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Thank you for the information.

> Folks are working on adding htab-like
> protection to other map types and there is an orthogonal effort
> to introduce bpf specific spinlock with run-time deadlock protection
> that bpf maps and progs will use.

A few followup questions regarding this if you don't mind.
1. Would this allow to remove the recursion protection which prevents
kprobes and tracepoints from nesting?
2. What exactly is the reason for using a per-cpu bpf_prog_active to
perform recursion protection for kprobes and tracepoints, as opposed
to the more granular approach taken by raw_tracepoints or
bpf_trampolines?
    2a. Is nesting protection required at all for raw_tracepoints?
Atleast for raw_tracepoints the protection seems to have been
introduced to address the WARN_ON_ONCE when the allowed nesting level
in try_get_fmt_tmp_buf() is exceeded? Why not just remove the warning
if nesting is possible and expected?

Thanks,
Usama Saqib.



On Fri, Jun 14, 2024 at 10:00 PM Alexei Starovoitov
<alexei.starovoitov@xxxxxxxxx> wrote:
>
> On Wed, Jun 12, 2024 at 3:38 AM Usama Saqib <usama.saqib@xxxxxxxxxxxxx> wrote:
> >
> > Hello,
> >
> > Some map operations via syscalls on hash maps (and some others)
> > disable bpf programs from running on the same CPU with
> > bpf_disable_instrumentation. The provided reason for this is to
> > prevent deadlocks when a nested bpf program tries to access an already
> > held bucket lock. From my understanding, this can happen due to a
> > kprobe on a function called after the lock is acquired. However,
> > htab_lock_bucket already handles this case by returning EBUSY if such
> > a scenario were to happen. Is there any other reason for disabling bpf
> > programs on the CPU?
>
> Correct. bpf_disable_instrumentation() is a mechanism to prevent
> bpf-kprobe progs being invoked from the inner places of bpf maps.
> htab has a separate protection via htab_lock_bucket.
> array map doesn't need such thing, but there are other map types.
> disable_instrumentation() is mainly for those.
>
> > The effect of this is that 1) bpf programs attached to a kprobe or
> > tracepoint in an irq context get skipped while inside
> > bpf_[enable,disable]_instrumentation block but before the
> > preempt_disable via htab_lock_bucket, 2) when CONFIG_PREEMPTION=y and
> > preempt=full then a bpf program running from user context may also get
> > skipped while inside the bpf_[enable,disable]_instrumentation block.
>
> Yes. It is unfortunate. Folks are working on adding htab-like
> protection to other map types and there is an orthogonal effort
> to introduce bpf specific spinlock with run-time deadlock protection
> that bpf maps and progs will use.
> Once it's available this disable_instrumention logic can be lifted.





[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux