On 5/22/24 16:03, Alexei Starovoitov wrote:
> On Tue, May 21, 2024 at 2:59 PM Barret Rhoden <brho@xxxxxxxxxx> wrote:
>> hi -
>>
>> we've noticed some variability in bpf timer expiration that goes away if
>> we change the timers to run in hardirq context.
>
> What kind of variability are we talking about?
hmm - it's actually worse than just variability.  the issue is that
we're using the timer to implement scheduling policy, yet the timer
sometimes gets handled by ksoftirqd, and ksoftirqd relies on the
scheduling policy to run.  we end up with a circular dependency.

e.g. say we want to let a very high priority thread run for 50us.
ideally we'd just set a timer for 50us and force a context switch when
it goes off.

but if timers might require ksoftirqd to run, we'll have to treat that
ksoftirqd specially (always run ksoftirqd whenever it is runnable), and
then we won't be able to let the high prio thread run ahead of other,
less important softirqs.
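for reference, the 50us-slice idea above would look roughly like the
sketch below.  this is just illustrative - the map layout, attach point,
and callback body are all made up for the example; only the bpf_timer
helper calls (bpf_timer_init / bpf_timer_set_callback / bpf_timer_start)
are the real kernel API:

```c
/* hypothetical sketch: arm a 50us bpf_timer from a BPF program.
 * a struct bpf_timer must be embedded in a map value. */
#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>

struct elem {
	struct bpf_timer t;
};

struct {
	__uint(type, BPF_MAP_TYPE_ARRAY);
	__uint(max_entries, 1);
	__type(key, int);
	__type(value, struct elem);
} timer_map SEC(".maps");

static int slice_expired(void *map, int *key, struct bpf_timer *timer)
{
	/* scheduling-policy decision (e.g. force a resched) would go here.
	 * today this callback runs from the hrtimer softirq, so it can be
	 * deferred to ksoftirqd - the circular dependency above. */
	return 0;
}

SEC("tp_btf/sched_switch")	/* hypothetical attach point */
int arm_slice(void *ctx)
{
	int key = 0;
	struct elem *e = bpf_map_lookup_elem(&timer_map, &key);

	if (!e)
		return 0;
	bpf_timer_init(&e->t, &timer_map, CLOCK_MONOTONIC);
	bpf_timer_set_callback(&e->t, slice_expired);
	bpf_timer_start(&e->t, 50000 /* 50us in ns */, 0);
	return 0;
}

char _license[] SEC("license") = "GPL";
```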
>> i imagine the use of softirqs was to keep the potentially long-running
>> timer callback out of hardirq, but is there anything particularly
>> dangerous about making them run in hardirq?
> exactly what you said. We don't have a good mechanism to
> keep bpf prog runtime tiny enough for hardirq.
i think stuff like the scheduler tick, and any bpf progs that run there,
already runs in hardirq - let alone tracing progs.  so if we've already
opened the gates to hardirq progs, maybe letting timers run there too
would be ok?  perhaps gated on CAP_BPF.
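fwiw, the underlying hrtimer API already distinguishes the two expiry
contexts, so the plumbing exists on the kernel side.  a purely
illustrative kernel fragment (the timer name and callback are made up;
the hrtimer calls and HRTIMER_MODE_REL_HARD / HRTIMER_MODE_REL_SOFT
modes are the real API):

```c
/* illustrative kernel-side fragment, not a patch:
 * a soft-mode hrtimer expires from the hrtimer softirq (ksoftirqd may
 * run it); a hard-mode hrtimer expires directly in hardirq context. */
#include <linux/hrtimer.h>
#include <linux/ktime.h>

static struct hrtimer slice_timer;

static enum hrtimer_restart slice_cb(struct hrtimer *t)
{
	/* runs in hardirq context when started in a _HARD mode,
	 * so it must be short and must not sleep */
	return HRTIMER_NORESTART;
}

static void arm_slice_timer(void)
{
	hrtimer_init(&slice_timer, CLOCK_MONOTONIC, HRTIMER_MODE_REL_HARD);
	slice_timer.function = slice_cb;
	hrtimer_start(&slice_timer, ns_to_ktime(50000),
		      HRTIMER_MODE_REL_HARD);
}
```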
barret