Re: [PATCH bpf-next 1/3] bpf: Introduce bpf_timer

Toke Høiland-Jørgensen <toke@xxxxxxxxxx> · Thu, 03 Jun 2021 00:21:05 +0200

Alexei Starovoitov <alexei.starovoitov@xxxxxxxxx> writes:

> On Thu, May 27, 2021 at 06:57:17PM +0200, Toke Høiland-Jørgensen wrote:
>> >     if (val) {
>> >         bpf_timer_init(&val->timer, timer_cb2, 0);
>> >         bpf_timer_start(&val->timer, 1000 /* call timer_cb2 in 1 msec */);
>> 
>> nit: there are 1M nanoseconds in a millisecond :)
>
> oops :)
>
>> >     }
>> > }
>> >
>> > This patch adds helper implementations that rely on hrtimers
>> > to call bpf functions as timers expire.
>> > The following patch adds necessary safety checks.
>> >
>> > Only programs with CAP_BPF are allowed to use bpf_timer.
>> >
>> > The amount of timers used by the program is constrained by
>> > the memcg recorded at map creation time.
>> >
>> > The bpf_timer_init() helper is receiving hidden 'map' and 'prog' arguments
>> > supplied by the verifier. The prog pointer is needed to do refcnting of bpf
>> > program to make sure that program doesn't get freed while timer is armed.
>> >
>> > Signed-off-by: Alexei Starovoitov <ast@xxxxxxxxxx>
>> 
>> Overall this LGTM, and I believe it will be usable for my intended use
>> case. One question:
>> 
>> With this, it will basically be possible to create a BPF daemon, won't
>> it? I.e., if a program includes a timer and the callback keeps re-arming
>> itself this will continue indefinitely even if userspace closes all refs
>> to maps and programs? Not saying this is a problem, just wanted to check
>> my understanding (i.e., that there's not some hidden requirement on
>> userspace keeping a ref open that I'm missing)...
>
> That is correct.
> Another option would be to auto-cancel the timer when the last reference
> to the prog drops. That may feel safer, since forever
> running bpf daemon is a certainly new concept.
> The main benefits of doing prog_refcnt++ from bpf_timer_start are ease
> of use and no surprises.
> Disappearing timer callback when prog unloads is outside of bpf prog control.
> For example the tracing bpf prog might collect some data and periodically
> flush it to user space. The prog would arm the timer and when callback
> is invoked it would send the data via ring buffer and start another
> data collection cycle.
> When the user space part of the service exits it doesn't have
> an ability to enforce the flush of the last chunk of data.
> It could do prog_run cmd that will call the timer callback,
> but it's racy.
> The solution to this problem could be __init and __fini
> sections that will be invoked when the prog is loaded
> and when the last refcnt drops.
> It's a complementary feature though.
> The prog_refcnt++ from bpf_timer_start combined with a prog
> explicitly doing bpf_timer_cancel from __fini
> would be the most flexible combination.
> This way the prog can choose to be a daemon or it can choose
> to cancel its timers and do data flushing when the last prog
> reference drops.
> The prog refcnt would be split (similar to uref). The __fini callback
> will be invoked when refcnt reaches zero, but all increments
> done by bpf_timer_start will be counted separately.
> The user space wouldn't need to do the prog_run command.
> It would detach the prog and close(prog_fd).
> That will trigger __fini callback that will cancel the timers
> and the prog will be fully unloaded.
> That would make bpf progs resemble kernel modules even more.

I like the idea of a "destructor" that will trigger on refcnt drop to
zero. And I do think a "bpf daemon" is potentially a useful, if novel,
concept.

The __fini thing kinda supposes a well-behaved program, though, right?
I.e., it would be fairly trivial to write a program that spins forever
by repeatedly scheduling the timer with a very short interval (whether
by malice or bugginess). So do we need a 'bpfkill' type utility to nuke
buggy programs, or how would resource constraints be enforced?

-Toke