[ Adding x86 / scheduler folks to Cc given that PREEMPT_LAZY as-is would
  cause serious regressions for us. ]

On 11/18/24 10:14 AM, Usama Saqib wrote:
> Hello,
>
> I hope everyone is doing well. It seems that work has started to
> introduce a new preemption model in the linux kernel PREEMPT_LAZY [1].
> According to the mailing list, the maintainers intend for this to
> replace PREEMPT_NONE and PREEMPT_VOLUNTARY as the default preemption
> model.
>
> From the changeset, it looks like PREEMPT_LAZY allows
> irqentry_exit_cond_resched() to get called on IRQ exit. This change,
> similar to PREEMPT_FULL, can get two bpf programs attached to a kprobe
> or tracepoint running in user context, to nest. This currently causes
> the nesting program to miss. I have been able to get these misses to
> happen on top of this new patch.
>
> This behavior is currently not possible with the default preemption
> model used in most distributions, PREEMPT_VOLUNTARY. For many products
> using BPF for tracing/security, this would constitute a regression in
> terms of reliability.
>
> My question is whether there is any ongoing work to fix this behavior
> of kprobes and tracepoints, so they do not miss on nesting. I have
> previously been told that there is ongoing work related to
> bpf-specific spinlocks to resolve this problem [2]. Will that be
> available by the time this is merged into the mainline, and the
> current defaults deprecated?
>
> Thanks,
> Usama Saqib.
>
> 1. https://lwn.net/ml/all/20241007074609.447006177@xxxxxxxxxxxxx/
> 2. https://lore.kernel.org/bpf/CAOzX8ixsxPbw1ke=DsDd_b38k1TE+JRG3LvJfh4wD60mhHvAqA@xxxxxxxxxxxxxx/T/#m206e33e5a0a0d9d3d498480a53aa9c87c81d91ff
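
For readers new to the problem, the nesting miss described in the quoted
message can be illustrated with a small user-space model. This is only a
sketch, assuming the miss comes from a per-CPU recursion guard that
refuses to start a second tracing program while one is still running on
the same CPU. The names Cpu, run_prog and simulate are hypothetical and
do not correspond to kernel symbols.

```python
class Cpu:
    """Models one CPU with a per-CPU 'a tracing program is active' counter."""

    def __init__(self):
        self.active = 0   # hypothetical stand-in for a per-CPU recursion guard
        self.runs = 0     # programs that actually executed
        self.misses = 0   # programs skipped because another one was active


def run_prog(cpu, body=None):
    """Try to run a tracing program on `cpu`.

    If another program is already active on this CPU, record a miss
    instead of running. `body` models work done while the program runs.
    """
    cpu.active += 1
    try:
        if cpu.active != 1:
            cpu.misses += 1
            return False
        cpu.runs += 1
        if body is not None:
            body()
        return True
    finally:
        cpu.active -= 1


def simulate(preempt_on_irq_exit):
    """Two probe hits on one CPU; returns (runs, misses)."""
    cpu = Cpu()

    def second_hit():
        run_prog(cpu)

    if preempt_on_irq_exit:
        # The first program is preempted at IRQ exit, and the task that
        # gets scheduled hits a probe before the first program finishes.
        run_prog(cpu, body=second_hit)
    else:
        # No preemption inside the program: the hits run back to back.
        run_prog(cpu)
        second_hit()
    return cpu.runs, cpu.misses


if __name__ == "__main__":
    print("no preemption at IRQ exit:", simulate(False))
    print("preemption at IRQ exit:   ", simulate(True))
```

In this model the second probe hit is lost only when it nests inside the
first, which mirrors the regression being reported: the same two events
are both observed when the programs run sequentially.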