Re: [RFC PATCH bpf-next 10/17] bpf: Add support to attach program to multiple trampolines

On Thu, Aug 25, 2022 at 07:35:44PM -0700, Andrii Nakryiko wrote:
> On Thu, Aug 25, 2022 at 10:44 AM Alexei Starovoitov
> <alexei.starovoitov@xxxxxxxxx> wrote:
> >
> > On Thu, Aug 25, 2022 at 9:08 AM Jiri Olsa <olsajiri@xxxxxxxxx> wrote:
> > >
> > > On Tue, Aug 23, 2022 at 06:22:37PM -0700, Alexei Starovoitov wrote:
> > > > On Mon, Aug 08, 2022 at 04:06:19PM +0200, Jiri Olsa wrote:
> > > > > Adding support to attach program to multiple trampolines
> > > > > with new attach/detach interface:
> > > > >
> > > > >   int bpf_trampoline_multi_attach(struct bpf_tramp_prog *tp,
> > > > >                                   struct bpf_tramp_id *id)
> > > > >   int bpf_trampoline_multi_detach(struct bpf_tramp_prog *tp,
> > > > >                                   struct bpf_tramp_id *id)
> > > > >
> > > > > The program is passed as bpf_tramp_prog object and trampolines to
> > > > > attach it to are passed as bpf_tramp_id object.
> > > > >
> > > > > The interface creates new bpf_trampoline object which is initialized
> > > > > > as 'multi' trampoline and stored separately from standard trampolines.
> > > > >
> > > > > > The following rules define how the standard and multi trampolines
> > > > > > coexist:
> > > > >   - multi trampoline can attach on top of existing single trampolines,
> > > > >     which creates 2 types of function IDs:
> > > > >
> > > > >       1) single-IDs - functions that are attached within existing single
> > > > >          trampolines
> > > > >       2) multi-IDs  - functions that were 'free' and are now taken by new
> > > > >          'multi' trampoline
> > > > >
> > > > >   - we allow overlapping of 2 'multi' trampolines if they are attached
> > > > >     to same IDs
> > > > > >   - we do not allow any other overlapping of 2 'multi' trampolines
> > > > > >   - any new 'single' trampoline cannot attach to existing multi-IDs.
> > > > >
> > > > > Maybe better explained on following example:
> > > > >
> > > > >    - you want to attach program P to functions A,B,C,D,E,F
> > > > >      via bpf_trampoline_multi_attach
> > > > >
> > > > >    - D,E,F already have standard trampoline attached
> > > > >
> > > > >    - the bpf_trampoline_multi_attach will create new 'multi' trampoline
> > > > >      which spans over A,B,C functions and attach program P to single
> > > > >      trampolines D,E,F
> > > > >
> > > > >    - A,B,C functions are now 'not attachable' by any trampoline
> > > > >      until the above 'multi' trampoline is released
> > > >
> > > > This restriction is probably too severe.
> > > > Song added support for trampoline and KLPs to co-exist on the same function.
> > > > This multi trampoline restriction will resurface the same issue.
> > > > > afaik this restriction is only because multi trampoline image
> > > > is the same for A,B,C. This memory optimization is probably going too far.
> > > > How about we keep existing logic of one tramp image per function.
> > > > Pretend that multi-prog P matches BTF of the target function,
> > > > create normal tramp for it and attach prog P there.
> > > > > The prototype of P allows six u64. The args are potentially reading
> > > > garbage, but there are no safety issues, since multi progs don't know BTF types.
> > > >
> > > > > We still need single bpf_link_multi to contain btf_ids of all functions,
> > > > but it can point to many bpf tramps. One for each attach function.
> > > >
> > > > iirc we discussed something like this long ago, but I don't remember
> > > > why we didn't go that route.
> > > > arch_prepare_bpf_trampoline is fast.
> > > > bpf_tramp_image_alloc is fast too.
> > > > So attaching one multi-prog to thousands of btf_id-s should be fast too.
> > > > The destroy part is interesting.
> > > > There we will be doing thousands of bpf_tramp_image_put,
> > > > but it's all async now. We used to have synchronize_rcu() which could
> > > > be the reason why this approach was slow.
> > > > Or is this unregister_fentry that slows it down?
> > > > But register_ftrace_direct_multi() interface should have solved it
> > > > for both register and unregister?
> > >
> > > I think it's the synchronize_rcu_tasks at the end of each ftrace update,
> > > that's why we added un/register_ftrace_direct_multi that makes the changes
> > > for multiple ips and syncs once at the end
> >
> > hmm. Can synchronize_rcu_tasks be made optional?
> > For ftrace_direct that points to bpf tramps is it really needed?
> >
> > > > un/register_ftrace_direct_multi will attach/detach multiple ips
> > > to single address (trampoline), so for this approach we would need to add new
> > > ftrace direct api that would allow to set multiple ips to multiple trampolines
> > > within one call..
> >
> > right
> >
> > > I was already checking on that and looks doable
> >
> > awesome.
> >
> > > another problem might be that this update function will need to be called with
> > > all related trampoline locks, which in this case would be thousands
> >
> > sure. but these will be newly allocated trampolines and
> > brand new mutexes, so no contention.
> > But thousands of cmpxchg-s will take time. Would be good to measure
> > though. It might not be that bad.
> 
> What about the memory overhead of thousands of trampolines and
> trampoline images? Seems very wasteful to create one per each attach,
> when each attachment in general will be identical.
> 
> 
> If I remember correctly, last time we were also discussing creating a
> generic BPF trampoline that would save all 6 input registers,
> regardless of function's BTF signature. Such BPF trampoline should
> support calling both generic fentry/fexit programs and typed ones,
> because all the necessary data is stored on the stack correctly.
> 
> For the case when typed (non-generic) BPF trampoline is already
> attached to a function and now we are attaching generic fentry, why
> can't we "upgrade" existing BPF trampoline to become generic, and then
> just add generic multi-fentry program to its trampoline image? Once
> that multi-fentry is detached, we might choose to convert trampoline
> back to typed BPF trampoline (i.e., save only necessary registers, not
> all 6 of them), but that's more like an optimization, it doesn't have
> to happen.
> 
> Or is there something that would make such generic trampoline impossible?
> 
> If we go with this approach, then each multi-fentry attachment will be
> creating minimum amount of trampolines, determined by all the
> combinations of attached programs at that point. If after we attach
> multi-fentry to some set of functions we need to attach another
> multi-fentry or typed fentry, we'd potentially need to split
> trampolines and create a bit more of them. But while that sounds a bit
> complicated, we do all that under locks so there isn't much problem in
> doing that, no?
> 
> But in general, I agree with Alexei that this restriction on not being
> able to attach to a function once multi-attach trampoline is attached
> to it is a really-really bad restriction in production, where we can't
> control exactly what BPF apps run and in which order.

ah ok.. attaching single trampoline on top of attached multi trampoline
should be possible to add.. as long as one side of the problem is single
trampoline it should be doable, I'll check

leaving the restriction only to attaching one multi trampoline over
another (not equal) attached multi trampoline

would that be acceptable?
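
To make the remaining restriction concrete, here is a minimal user-space
sketch of the multi-over-multi check. It is not code from this patchset:
the struct multi_ids type and both function names are hypothetical, and
it assumes each multi trampoline can be modeled as a sorted,
duplicate-free array of function IDs. Two multi trampolines may coexist
only if their ID sets are disjoint or identical.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model, not kernel code: a multi trampoline is reduced to
 * a sorted array of unique function IDs (think BTF IDs). */
struct multi_ids {
	const unsigned int *ids;
	size_t cnt;
};

enum overlap { OVERLAP_NONE, OVERLAP_EQUAL, OVERLAP_PARTIAL };

/* Classify how two sorted, duplicate-free ID sets relate by walking
 * both arrays in step and counting the IDs they share. */
static enum overlap multi_ids_overlap(const struct multi_ids *a,
				      const struct multi_ids *b)
{
	size_t i = 0, j = 0, common = 0;

	while (i < a->cnt && j < b->cnt) {
		if (a->ids[i] < b->ids[j]) {
			i++;
		} else if (a->ids[i] > b->ids[j]) {
			j++;
		} else {
			common++;
			i++;
			j++;
		}
	}

	if (!common)
		return OVERLAP_NONE;
	if (common == a->cnt && common == b->cnt)
		return OVERLAP_EQUAL;
	return OVERLAP_PARTIAL;
}

/* A new multi trampoline may be attached next to an existing one only
 * if the two ID sets are disjoint or identical; any partial overlap
 * (including a strict subset) is rejected. */
static bool multi_attach_allowed(const struct multi_ids *existing,
				 const struct multi_ids *new_tr)
{
	return multi_ids_overlap(existing, new_tr) != OVERLAP_PARTIAL;
}
```

With an existing multi trampoline on {A,B,C}, a second one on {A,B,C} or
on {E,F} would pass this check, while one on {C,D} would not. Single
trampolines are outside this check entirely, since attaching them on top
of a multi trampoline would be allowed.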

> 
> P.S. I think this generic typeless BPF trampoline is a useful thing in
> itself and we are half-way there already with bpf_get_func_ip() and
> bpf_get_func_arg_cnt() helpers and storing such "parameters" on the
> stack, so tbh, we can probably split the problem into two and try to
> address a somewhat simpler and more straightforward generic BPF
> trampoline first. Such generic type-less BPF trampoline will make
> fentry a better and more generic alternative to kprobe, by being less
> demanding about specifying BTF ID (even if we don't care about input
> argument types) yet faster to trigger than kprobe.

yes, with the help of those helpers the only 'generic' thing for
trampoline is its BTF type

jirka


