Re: [PATCH 06/12] uprobes: add batch uprobe register/unregister APIs

Andrii Nakryiko <andrii.nakryiko@xxxxxxxxx> · Mon, 1 Jul 2024 18:34:55 -0700

On Mon, Jul 1, 2024 at 6:01 PM Masami Hiramatsu <mhiramat@xxxxxxxxxx> wrote:
>
> On Mon, 1 Jul 2024 15:15:56 -0700
> Andrii Nakryiko <andrii.nakryiko@xxxxxxxxx> wrote:
>
> > On Mon, Jul 1, 2024 at 10:55 AM Andrii Nakryiko
> > <andrii.nakryiko@xxxxxxxxx> wrote:
> > >
> > > On Sat, Jun 29, 2024 at 4:30 PM Masami Hiramatsu <mhiramat@xxxxxxxxxx> wrote:
> > > >
> > > > On Fri, 28 Jun 2024 09:34:26 -0700
> > > > Andrii Nakryiko <andrii.nakryiko@xxxxxxxxx> wrote:
> > > >
> > > > > On Thu, Jun 27, 2024 at 11:28 PM Masami Hiramatsu <mhiramat@xxxxxxxxxx> wrote:
> > > > > >
> > > > > > On Thu, 27 Jun 2024 09:47:10 -0700
> > > > > > Andrii Nakryiko <andrii.nakryiko@xxxxxxxxx> wrote:
> > > > > >
> > > > > > > On Thu, Jun 27, 2024 at 6:04 AM Masami Hiramatsu <mhiramat@xxxxxxxxxx> wrote:
> > > > > > > >
> > > > > > > > On Mon, 24 Jun 2024 17:21:38 -0700
> > > > > > > > Andrii Nakryiko <andrii@xxxxxxxxxx> wrote:
> > > > > > > >
> > > > > > > > > -static int __uprobe_register(struct inode *inode, loff_t offset,
> > > > > > > > > -                          loff_t ref_ctr_offset, struct uprobe_consumer *uc)
> > > > > > > > > +int uprobe_register_batch(struct inode *inode, int cnt,
> > > > > > > > > +                       uprobe_consumer_fn get_uprobe_consumer, void *ctx)
> > > > > > > >
> > > > > > > > Is this interface just for avoiding memory allocation? Can't we just
> > > > > > > > allocate a temporary array of *uprobe_consumer instead?
> > > > > > >
> > > > > > > Yes, exactly, to avoid the need for allocating another array that
> > > > > > > would just contain pointers to uprobe_consumer. Consumers would never
> > > > > > > just have an array of `struct uprobe_consumer *`, because
> > > > > > > uprobe_consumer struct is embedded in some other struct, so the array
> > > > > > > interface isn't the most convenient.
> > > > > >
> > > > > > OK, I understand it.
> > > > > >
> > > > > > >
> > > > > > > If you feel strongly, I can do an array, but this necessitates
> > > > > > > allocating an extra array *and keeping it* for the entire duration of
> > > > > > > BPF multi-uprobe link (attachment) existence, so it feels like a
> > > > > > > waste. This is because we don't want to do anything that can fail in
> > > > > > > the detachment logic (so no temporary array allocation there).
> > > > > >
> > > > > > No need to change it, that sounds reasonable.
> > > > > >
> > > > >
> > > > > Great, thanks.
> > > > >
> > > > > > >
> > > > > > > Anyways, let me know how you feel about keeping this callback.
> > > > > >
> > > > > > IMHO, maybe the interface function is better to change to
> > > > > > `uprobe_consumer *next_uprobe_consumer(void **data)`. If caller
> > > > > > side uses a linked list of structure, index access will need to
> > > > > > follow the list every time.
> > > > >
> > > > > This would be problematic. Note how we call get_uprobe_consumer(i,
> > > > > ctx) with i going from 0 to N in multiple independent loops. So if we
> > > > > are only allowed to ask for the next consumer, then
> > > > > uprobe_register_batch and uprobe_unregister_batch would need to build
> > > > > its own internal index and remember ith instance. Which again means
> > > > > more allocations and possibly failing uprobe_unregister_batch(), which
> > > > > isn't great.
> > > >
> > > > No, I think we can use a cursor variable as;
> > > >
> > > > int uprobe_register_batch(struct inode *inode,
> > > >                  uprobe_consumer_fn get_uprobe_consumer, void *ctx)
> > > > {
> > > >         void *cur = ctx;
> > > >
> > > >         while ((uc = get_uprobe_consumer(&cur)) != NULL) {
> > > >                 ...
> > > >         }
> > > >
> > > >         cur = ctx;
> > > >         while ((uc = get_uprobe_consumer(&cur)) != NULL) {
> > > >                 ...
> > > >         }
> > > > }
> > > >
> > > > This can also remove the cnt.
> > >
> > > Ok, if you prefer this I'll switch. It's a bit more cumbersome to use
> > > for callers, but we have one right now, and might have another one, so
> > > not a big deal.
> > >
> >
> > Actually, now that I started implementing this, I really-really don't
> > like it. In the example above you assume by storing and reusing
> > original ctx value you will reset iteration to the very beginning.
> > This is not how it works in practice though. Ctx is most probably a
> > pointer to some struct somewhere with iteration state (e.g., array of
> > all uprobes + current index), and so get_uprobe_consumer() doesn't
> > update `void *ctx` itself, it updates the state of that struct.
>
> Yeah, that should be noted so that if the get_uprobe_consumer() is
> called with original `ctx` value, it should return the same.
> Ah, and I found we need to pass both ctx and pos...
>
>        while ((uc = get_uprobe_consumer(&cur, ctx)) != NULL) {
>                  ...
>          }
>
> Usually it is enough to pass the cursor as similar to the other
> for_each_* macros. For example, struct foo has .list and .uc, then
>
> struct uprobe_consumer *get_uprobe_consumer_foo(void **pos, void *head)
> {
>         struct foo *foo = *pos;
>
>         if (!foo)
>                 return NULL;
>
>         *pos = list_next_entry(foo, list);
>         if (list_is_head(pos, (head)))
>                 *pos = NULL;
>
>         return foo->uc;
> }
>
> This works something like this.
>
> #define for_each_uprobe_consumer_from_foo(uc, pos, head) \
>         list_for_each_entry(pos, head, list) \
>                 if (uc = uprobe_consumer_from_foo(pos))
>
> or, for array of *uprobe_consumer (array must be end with NULL),
>
> struct uprobe_consumer *get_uprobe_consumer_array(void **pos, void *head __unused)
> {
>         struct uprobe_consumer **uc = *pos;
>
>         if (!*uc)
>                 return NULL;
>
>         *pos = uc + 1;
>
>         return *uc;
> }
>
> But this may not be able to support array of uprobe_consumer. Hmm.
>
>
> > And so there is no easy and clean way to reset this iterator without
> > adding another callback or something. At which point it becomes quite
> > cumbersome and convoluted.
>
> If you consider that is problematic, I think we can prepare more
> iterator like object;
>
> struct uprobe_consumer_iter_ops {
>         struct uprobe_consumer *(*start)(struct uprobe_consumer_iter_ops *);
>         struct uprobe_consumer *(*next)(struct uprobe_consumer_iter_ops *);
>         void *ctx; // or, just embed the data in this structure.
> };
>

Yeah, I was thinking about something like this for adding a proper
iterator-based interface.

>
> > How about this? I'll keep the existing get_uprobe_consumer(idx, ctx)
> > contract, which works for the only user right now, BPF multi-uprobes.
> > When it's time to add another consumer that works with a linked list,
> > we can add another more complicated contract that would do
> > iterator-style callbacks. This would be used by linked list users, and
> > we can transparently implement existing uprobe_register_batch()
> > contract on top of if by implementing a trivial iterator wrapper on
> > top of get_uprobe_consumer(idx, ctx) approach.
>
> Agreed, anyway as far as it uses an array of uprobe_consumer, it works.
> When we need to register list of the structure, we may be possible to
> allocate an array or introduce new function.
>

Cool, glad we agree. What you propose above with start + next + ctx
seems like a way forward if we need this.

BTW, is this (batched register/unregister APIs) something you'd like
to use from the tracefs-based (or whatever it's called, I mean non-BPF
ones) uprobes as well? Or there is just no way to even specify a batch
of uprobes? Just curious if you had any plans for this.

> Thank you!
>
> >
> > Let's not add unnecessary complications right now given we have a
> > clear path forward to add it later, if necessary, without breaking
> > anything. I'll send v2 without changes to get_uprobe_consumer() for
> > now, hopefully my above plan makes sense to you. Thanks!
> >
> > > >
> > > > Thank you,
> > > >
> > > > >
> > > > > For now this API works well, I propose to keep it as is. For linked
> > > > > list case consumers would need to allocate one extra array or pay the
> > > > > price of O(N) search (which might be ok, depending on how many uprobes
> > > > > are being attached). But we don't have such consumers right now,
> > > > > thankfully.
> > > > >
> > > > > >
> > > > > > Thank you,
> > > > > >
> > > > > >
> > > > > > >
> > > > > > > >
> > > > > > > > Thank you,
> > > > > > > >
> > > > > > > > --
> > > > > > > > Masami Hiramatsu (Google) <mhiramat@xxxxxxxxxx>
> > > > > >
> > > > > >
> > > > > > --
> > > > > > Masami Hiramatsu (Google) <mhiramat@xxxxxxxxxx>
> > > >
> > > >
> > > > --
> > > > Masami Hiramatsu (Google) <mhiramat@xxxxxxxxxx>
>
>
> --
> Masami Hiramatsu (Google) <mhiramat@xxxxxxxxxx>