Re: BPF and lazy preemption.

Peter Zijlstra <peterz@xxxxxxxxxxxxx> · Tue, 10 Dec 2024 15:14:13 +0100

On Tue, Dec 10, 2024 at 02:25:20PM +0100, Usama Saqib wrote:
> [ Adding x86 / scheduler folks to Cc given PREEMPT_LAZY as-is would cause
>   serious regressions for us. ]
> 
> On 11/18/24 10:14 AM, Usama Saqib wrote:
> > Hello,
> >
> > I hope everyone is doing well. It seems that work has started to
> > introduce a new preemption model in the Linux kernel, PREEMPT_LAZY [1].
> > According to the mailing list, the maintainers intend for this to
> > replace PREEMPT_NONE and PREEMPT_VOLUNTARY as the default preemption
> > model.
> >
> > From the changeset, it looks like PREEMPT_LAZY allows
> > irqentry_exit_cond_resched() to be called on IRQ exit. As with
> > PREEMPT_FULL, this change lets two BPF programs attached to a kprobe
> > or tracepoint, running in user context, nest. This currently causes
> > the nesting program to miss. I have been able to reproduce these
> > misses on top of this new patch.
> >
> > This behavior is currently not possible with the default preemption
> > model used in most distributions, PREEMPT_VOLUNTARY. For many products
> > using BPF for tracing/security, this would constitute a regression in
> > terms of reliability.
> >
> > My question is whether there is any ongoing work to fix this behavior
> > of kprobes and tracepoints, so they do not miss on nesting. I have
> > previously been told that there is ongoing work related to
> > bpf-specific spinlocks to resolve this problem [2]. Will that be
> > available by the time this is merged into the mainline, and the
> > current defaults deprecated?

I have no idea about the whole BPF thing, but if the behaviour is the
same as with PREEMPT_FULL, then there is nothing to fix from a scheduler
PoV.

Note that most distros already build with PREEMPT_DYNAMIC, which allows
users/admins to dynamically select the preemption model (either at boot
or at runtime through debugfs).
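
For the runtime side, here is a minimal sketch of how a tool could check
which model is active. It assumes debugfs is mounted at /sys/kernel/debug,
that the file is sched/preempt, and that its contents list the available
models with the active one in parentheses (for example
"none voluntary (full)"). The helper names are hypothetical, and the exact
path and format may differ between kernel versions.

```python
from pathlib import Path

# Assumed location of the PREEMPT_DYNAMIC knob; reading it usually
# requires root and a mounted debugfs.
PREEMPT_PATH = Path("/sys/kernel/debug/sched/preempt")


def parse_preempt(text):
    """Parse text such as 'none voluntary (full)'.

    Returns (active, available), where active is the model shown in
    parentheses (or None if no entry is marked) and available lists
    every model name in the order given.
    """
    tokens = text.split()
    available = [t.strip("()") for t in tokens]
    active = next(
        (t[1:-1] for t in tokens if t.startswith("(") and t.endswith(")")),
        None,
    )
    return active, available


def read_preempt(path=PREEMPT_PATH):
    """Read and parse the knob; returns (None, []) if it cannot be read."""
    try:
        return parse_preempt(path.read_text())
    except OSError:
        # File missing (no PREEMPT_DYNAMIC, no debugfs) or not permitted.
        return None, []


if __name__ == "__main__":
    active, available = read_preempt()
    print("active:", active, "available:", available)
```

On such a kernel the model can also be picked at boot with the preempt=
command-line parameter, or changed at runtime by writing a model name to
the same file as root.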

If certain BPF stuff cannot deal with full preemption, then I would have
to call it broken.