On Fri, Jun 11, 2021 at 11:45 AM Alexei Starovoitov <alexei.starovoitov@xxxxxxxxx> wrote: > > On Thu, Jun 10, 2021 at 11:42:24PM -0700, Cong Wang wrote: > > On Thu, Jun 10, 2021 at 9:27 PM Alexei Starovoitov > > <alexei.starovoitov@xxxxxxxxx> wrote: > > > > > > From: Alexei Starovoitov <ast@xxxxxxxxxx> > > Please stick to one email thread in the future, ok? > > I'll consolidate them here: > > > What is your use case to justify your own code? Asking because > > you deny mine, so clearly my use case is not yours. > > I mentioned several use cases in the prior threads. > tldr: any periodic event in tracing, networking, security. > Garbage collection falls into this category as well, but please internalize > that implementing conntrack as it is today in the kernel is an explicit non-goal. You need to read my use case again, it is for the conntrack in Cilium, not the kernel one. > > > And more importantly, why not just use BPF_TEST_RUN with > > a user-space timer? Asking because you offer no API to read or > > modify timer expiration, so literally the same with BPF_TEST_RUN > > approach. > > a wrapper on top of hrtimer_get_remaining() like bpf_timer_get_remaining() > is trivial to add, but what is the use case? If you do not have any use case, then stick to BPF_TEST_RUN with user-space timers? And of course your patches are not needed at all. > > > > > > > Introduce 'struct bpf_timer { __u64 :64; __u64 :64; };' that can be embedded > > > in hash/array/lru maps as regular field and helpers to operate on it: > > > > Can be or has to be? Huge difference here. > > map elements don't have to use timers. You interpret this in a wrong way, what I asked is whether a bpf timer has to be embedded in a map. IOW, can a bpf timer be a standalone global data? > > > In the other thread, you said it is global data, which implies that it does > > not have to be in a map. > > global data is a map. That was explained in the prior thread as well. > I think you implied bpf timer can exist without a map, hence I am asking. > > > > In your test case or your example, all timers are still in a map. So what has > > changed since then? Looks nothing to me. > > look again? Yes, I just looked at it again, only more confusing, not less. > > > Hmm, finally you begin refcounting it, which you were strongly against. ;) > > That was already answered in the prior thread. > tldr: there were two options. This is one of them. Another can be added > in the future as well. > > > Three questions: > > > > 1. Can t->prog be freed between bpf_timer_init() and bpf_timer_start()? > > yes. Good. So if a program which only initializes the timer and then exits, the other program which shares this timer will crash when it calls bpf_timer_start(), right? > > > If the timer subprog is always in the same prog which installs it, then > > installs it? I'm not following the quesiton. > > > this is fine. But then how do multiple programs share a timer? > > there is only one callback function. That's exactly my question. How is one callback function shared by multiple eBPF programs which want to share the timer? > > > In the > > case of conntrack, either ingress or egress could install the timer, > > it only depends which one gets traffic first. Do they have to copy > > the same subprog for the same timer? > > conntrack is an explicit non-goal. I interpret this as you do not want timers to be shared by multiple eBPF programs, correct? Weirdly, maps are shared, timers are placed in a map, so timers should be shared naturally too. > > > > > 2. Can t->prog be freed between a timer expiration and bpf_timer_start() > > again? > > If it's already armed with the first bpf_timer_start() it won't be freed. Why? I see t->prog is released in your timer callback: + bpf_prog_put(prog); + this_cpu_write(hrtimer_running, NULL); + return HRTIMER_NORESTART; > > > It gets a refcnt when starting a timer and puts it when cancelling > > or expired, so t->prog can be freed right after cancelling or expired. What > > if another program which shares this timer wants to restart this timer? > > There is only one callback_fn per timer. Another program can share > the struct bpf_timer and the map. It might have subprog callback_fn code > that looks exactly the same as callback_fn in the first prog. > For example when libbpf loads progs/timer.c (after it was compiled into .o) > it might share a subprog in the future (when kernel has support for > dynamic linking). From bpf user pov it's a single .c file. > The split into programs and subprograms is an implemenation detail > that C programmer doesn't need to worry about. Not exactly, they share a same C file but still can be loaded/unloaded separately. And logically, whether a timer has been initialized once or twice makes a huge difference for programers. > > > 3. Since you offer no API to read the expiration time, why not just use > > BPF_TEST_RUN with a user-space timer? This is preferred by Andrii. > > Andrii point was that there should be no syscall cmds that replicate > bpf_timer_init/start/cancel helpers. I agree with this. Actually there is no strong reason to bother a bpf timer unless you want to access the timer itself, which mostly contains expiration. User-space timers work just fine for your cases, even if not, extending BPF_TEST_RUN should too. > > > > Thanks. > > > > Another unpopular point of view: > > > > This init() is not suitable for bpf programs, because unlike kernel modules, > > there is no init or exit functions for a bpf program. And timer init > > is typically > > called during module init. > > Already answerd this in the prior thread. There will be __init and __fini like > subprograms in bpf progs. I interpret this as init does not make sense until we have __init and __fini? > > Please apply the patches to your local tree and do few experiments based > on selftests/bpf/progs/timer.c. I think experimenting with the code > will answer all of your questions. Sounds like you find a great excuse for a failure of documentation. What I asked are just fundamental design questions you should have covered in your cover letter, which is literally empty. Thanks.