On Thu, May 23, 2024 at 9:42 AM Barret Rhoden <brho@xxxxxxxxxx> wrote:
>
> On 5/22/24 16:03, Alexei Starovoitov wrote:
> > On Tue, May 21, 2024 at 2:59 PM Barret Rhoden <brho@xxxxxxxxxx> wrote:
> >>
> >> hi -
> >>
> >> we've noticed some variability in bpf timer expiration that goes away if
> >> we change the timers to run in hardirq context.
> >
> > What kind of variability are we talking about?
>
> hmm - it's actually worse than just variability.  the issue is that
> we're using the timer to implement scheduling policy.  yet the timer
> sometimes gets handled by ksoftirqd.  and ksoftirqd relies on the
> scheduling policy to run.  we end up with a circular dependence.
>
> e.g. say we want to let a very high priority thread run for 50us.
> ideally we'd just set a timer for 50us and force a context switch when
> it goes off.
>
> but if timers might require ksoftirqd to run, we'll have to treat that
> ksoftirqd specially (always run ksoftirqd if it is runnable), and then
> we won't be able to let the high prio thread run ahead of other, less
> important softirqs.

Understood. That's fair enough.
It's not jitter, but that softirq in general cannot satisfy the requirement.
Please add this explanation to the commit log.

I think another example would be to implement a watchdog with bpf_timer
in hardirq for things that run in softirq, like napi.

> >> i imagine the use of softirqs was to keep the potentially long-running
> >> timer callback out of hardirq, but is there anything particularly
> >> dangerous about making them run in hardirq?
> >
> > exactly what you said. We don't have a good mechanism to
> > keep bpf prog runtime tiny enough for hardirq.
>
> i think stuff like the scheduler tick, and any bpf progs that run there
> are also run in hardirq.  let alone tracing progs.  so maybe if we've
> already opened the gates to hardirq progs, then maybe letting timers run
> there too would be ok?  perhaps with CAP_BPF.

bpf_timer already requires cap_bpf.
No need for extra restrictions.
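[Editor's sketch, not part of the thread.] The 50us-slice scenario Barret describes can be sketched as a BPF program that arms a bpf_timer. The helpers shown (bpf_timer_init, bpf_timer_set_callback, bpf_timer_start) are real kernel helpers; the map name, the slice_expired() policy body, and the attach point are illustrative assumptions, and with flags == 0 the callback fires in softirq context, which is exactly the circular dependence the thread is about:

```c
// Hedged sketch of the 50us scheduling-slice timer discussed above.
// Map layout, section names, and the callback body are assumptions;
// only the bpf_timer helper calls themselves are real kernel API.
#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>

struct elem {
	struct bpf_timer t;
};

struct {
	__uint(type, BPF_MAP_TYPE_ARRAY);
	__uint(max_entries, 1);
	__type(key, int);
	__type(value, struct elem);
} slice_timer SEC(".maps");

/* Timer callback: fires when the 50us slice expires.  In the thread's
 * scenario this is where the scheduler would force a context switch.
 * With default flags this runs from the timer softirq, so it may be
 * deferred to ksoftirqd -- the circular dependence described above.
 */
static int slice_expired(void *map, int *key, struct elem *val)
{
	/* policy-specific preemption logic omitted */
	return 0;
}

SEC("tp_btf/sched_switch")
int arm_slice(void *ctx)
{
	int key = 0;
	struct elem *e = bpf_map_lookup_elem(&slice_timer, &key);

	if (!e)
		return 0;
	bpf_timer_init(&e->t, &slice_timer, CLOCK_MONOTONIC);
	bpf_timer_set_callback(&e->t, slice_expired);
	/* 50us relative expiry; flags == 0 means softirq delivery */
	bpf_timer_start(&e->t, 50000 /* ns */, 0);
	return 0;
}

char LICENSE[] SEC("license") = "GPL";
```

A hardirq-delivery option, as proposed in the thread, would presumably surface as an extra flag to bpf_timer_start so the callback runs directly from the hrtimer interrupt instead of being queued behind softirq processing.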