On Tue, Jun 29, 2021 at 4:28 PM Alexei Starovoitov
<alexei.starovoitov@xxxxxxxxx> wrote:
>
> On Mon, Jun 28, 2021 at 11:34 PM Andrii Nakryiko
> <andrii.nakryiko@xxxxxxxxx> wrote:
> >
> > Have you considered alternatively to implement something like
> > bpf_ringbuf_query() for BPF ringbuf that will allow to query various
> > things about the timer (e.g., whether it is active or not, and, of
> > course, remaining expiry time). That will be more general, easier to
> > extend, and will cover this use case:
> >
> > long exp = bpf_timer_query(&t->timer, BPF_TIMER_EXPIRY);
> > bpf_timer_start(&t->timer, new_callback, exp);
>
> yes, but...
> hrtimer_get_remaining + timer_start to that value is racy
> and not accurate.

yes, but even though we specify expiration in nanosecond precision, no
one should expect that precision w.r.t. when the callback actually
fires. So fetching the current expiration, adding a new one, and
re-setting it shouldn't be a problem in practice, IMO.

I just think the most common case is to set a timer once, so ideally
usability is optimized for that (taken to the extreme it would be just
bpf_timer_start without any bpf_timer_init, but we've already
discussed this, no need to do that again here). Needing bpf_timer_init
+ bpf_timer_set_callback + bpf_timer_start for the common case feels
suboptimal usability-wise.

There is also a new race with bpf_timer_set_callback +
bpf_timer_start. The callback can fire in between those two
operations, so we could get the new callback at the old expiration or
the old callback with the new expiration. To do a full update
reliably, you'd need to explicitly bpf_timer_cancel() first, at which
point a separate bpf_timer_set_callback() doesn't help at all.

> hrtimer_get_expires_ns + timer_start(MODE_ABS)
> would be accurate, but that's an unnecessary complication.
> To live replace old bpf prog with new one
> bpf_for_each_map_elem() { bpf_timer_set_callback(new_prog); }
> is much faster, since timers don't need to be dequeue, enqueue.
> No need to worry about hrtimer machinery internal changes, etc.
> bpf prog being replaced shouldn't be affecting the rest of the system.

That's a good property, but if it was done as a
bpf_timer_set_callback() in addition to the current
bpf_timer_start(callback_fn), it would still allow a simple typical
use.

Another usability consideration. With a mandatory
bpf_timer_set_callback(), bpf_timer_start() will need to return some
error code if the callback wasn't set yet, right? I'm afraid that in
practice it will be a situation similar to bpf_trace_printk(), where
people expect that it always succeeds and never check the return
code. It's obviously debuggable, but a friction point nevertheless.

>
> > This will keep common timer scenarios to just two steps, init + start,
> > but won't prevent more complicated ones. Things like extending
> > expiration by one second relative that what was remaining will be
> > possible as well.
>
> Extending expiration would be more accurate with hrtimer_forward_now().
>
> All of the above points are minor compared to the verifier advantage.
> bpf_timer_set_callback() typically won't be called from the callback.
> So verifier's insn_procssed will be drastically lower.
> The combinatorial explosion of states even for this small
> selftests/bpf/progs/timer.c is significant.
> With bpf_timer_set_callback() is done outside of callback the verifier
> behavior will be predictable.
> To some degree patches 4-6 could have been delayed, but since the
> the algo is understood and it's working, I'm going to keep them.
> It's nice to have that flexibility, but the less pressure on the
> verifier the better.

I haven't had time to understand those new patches yet, sorry, so I'm
not sure where the state explosion is coming from. I'll get to it for
real next week. But improving verifier internals can be done
transparently, while changing/fixing BPF UAPI is much harder and more
disruptive.
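
To make the bpf_timer_set_callback() + bpf_timer_start() race mentioned
earlier concrete, here is a rough userspace model in plain C. It is only
a sketch under loud assumptions: struct model_timer and all model_*
functions are made up for illustration, they are not the proposed BPF
helpers or hrtimer internals, and "the timer firing" is simulated by an
explicit model_tick() call at a chosen time.

```c
#include <stdint.h>

/* Hypothetical userspace model, NOT the kernel implementation. */
typedef int (*model_cb_t)(void);

struct model_timer {
    model_cb_t cb;        /* callback that runs on expiry */
    uint64_t expires_ns;  /* absolute expiry; 0 means "not armed" */
};

struct model_result {
    int at_old_expiry;    /* id of callback that ran at t=100, 0 if none */
    int at_new_expiry;    /* id of callback that ran at t=150, 0 if none */
};

static int old_cb(void) { return 1; }
static int new_cb(void) { return 2; }

static void model_set_callback(struct model_timer *t, model_cb_t cb)
{
    t->cb = cb;
}

static void model_start(struct model_timer *t, uint64_t now, uint64_t delay)
{
    t->expires_ns = now + delay;
}

static void model_cancel(struct model_timer *t)
{
    t->expires_ns = 0;
}

/* Fire the timer if it is armed and due; return the callback's id or 0. */
static int model_tick(struct model_timer *t, uint64_t now)
{
    if (t->expires_ns == 0 || now < t->expires_ns)
        return 0;
    t->expires_ns = 0;
    return t->cb ? t->cb() : 0;
}

/* set_callback + start, with the old expiry hitting between the two. */
static struct model_result racy_update(void)
{
    struct model_timer t = { .cb = old_cb, .expires_ns = 100 };
    struct model_result r;

    model_set_callback(&t, new_cb);
    r.at_old_expiry = model_tick(&t, 100); /* fires between the two calls */
    model_start(&t, 100, 50);
    r.at_new_expiry = model_tick(&t, 150);
    return r;
}

/* cancel first, then set_callback + start: nothing can fire in between. */
static struct model_result cancel_first_update(void)
{
    struct model_timer t = { .cb = old_cb, .expires_ns = 100 };
    struct model_result r;

    model_cancel(&t);
    model_set_callback(&t, new_cb);
    r.at_old_expiry = model_tick(&t, 100); /* timer is not armed here */
    model_start(&t, 100, 50);
    r.at_new_expiry = model_tick(&t, 150);
    return r;
}
```

In this model racy_update() runs the new callback at the old expiration
and then again at the new one, while cancel_first_update() runs it only
once, at the new expiration. The mirror case (old callback with new
expiration) would show up the same way if start were issued before
set_callback.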