Re: [PATCH bpf-next v2 3/6] bpf: support attaching freplace programs to multiple attach points

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Jul 16, 2020 at 7:06 PM Alexei Starovoitov
<alexei.starovoitov@xxxxxxxxx> wrote:
>
> On Thu, Jul 16, 2020 at 12:50:05PM +0200, Toke Høiland-Jørgensen wrote:
> > Alexei Starovoitov <alexei.starovoitov@xxxxxxxxx> writes:
> >
> > > On Wed, Jul 15, 2020 at 03:09:02PM +0200, Toke Høiland-Jørgensen wrote:
> > >>
> > >> +  if (tgt_prog_fd) {
> > >> +          /* For now we only allow new targets for BPF_PROG_TYPE_EXT */
> > >> +          if (prog->type != BPF_PROG_TYPE_EXT ||
> > >> +              !btf_id) {
> > >> +                  err = -EINVAL;
> > >> +                  goto out_put_prog;
> > >> +          }
> > >> +          tgt_prog = bpf_prog_get(tgt_prog_fd);
> > >> +          if (IS_ERR(tgt_prog)) {
> > >> +                  err = PTR_ERR(tgt_prog);
> > >> +                  tgt_prog = NULL;
> > >> +                  goto out_put_prog;
> > >> +          }
> > >> +
> > >> +  } else if (btf_id) {
> > >> +          err = -EINVAL;
> > >> +          goto out_put_prog;
> > >> +  } else {
> > >> +          btf_id = prog->aux->attach_btf_id;
> > >> +          tgt_prog = prog->aux->linked_prog;
> > >> +          if (tgt_prog)
> > >> +                  bpf_prog_inc(tgt_prog); /* we call bpf_prog_put() on link release */
> > >
> > > so the first prog_load cmd will beholding the first target prog?
> > > This is complete non starter.
> > > You didn't mention such decision anywhere.
> > > The first ext prog will attach to the first dispatcher xdp prog,
> > > then that ext prog will multi attach to second dispatcher xdp prog and
> > > the first dispatcher prog will live in the kernel forever.
> >
> > Huh, yeah, you're right that's no good. Missing that was a think-o on my
> > part, sorry about that :/
> >
> > > That's not what we discussed back in April.
> >
> > No, you mentioned turning aux->linked_prog into a list. However once I
> > started looking at it I figured it was better to actually have all this
> > (the trampoline and ref) as part of the bpf_link structure, since
> > logically they're related.
> >
> > But as you pointed out, the original reference sticks. So either that
> > needs to be removed, or I need to go back to the 'aux->linked_progs as a
> > list' idea. Any preference?
>
> Good question. Back then I was thinking about converting linked_prog into link
> list, since standalone single linked_prog is quite odd, because attaching ext
> prog to multiple tgt progs should have equivalent properties across all
> attachments.
> Back then bpf_link wasn't quite developed.
> Now I feel moving into bpf_tracing_link is better.
> I guess a link list of bpf_tracing_link-s from 'struct bpf_prog' might work.
> At prog load time we can do bpf_link_init() only (without doing bpf_link_prime)
> and keep this pre-populated bpf_link with target bpf prog and trampoline
> in a link list accessed from 'struct bpf_prog'.
> Then bpf_tracing_prog_attach() without extra tgt_prog_fd/btf_id would complete
> that bpf_tracing_link by calling bpf_link_prime() and bpf_link_settle()
> without allocating new one.
> Something like:
> struct bpf_tracing_link {
>         struct bpf_link link;  /* ext prog pointer is hidding in there */
>         enum bpf_attach_type attach_type;
>         struct bpf_trampoline *tr;
>         struct bpf_prog *tgt_prog; /* old aux->linked_prog */
> };
>
> ext prog -> aux -> link list of above bpf_tracing_link-s
>
> It's a circular reference, obviously.
> Need to think through the complications and locking.
>
> bpf_tracing_prog_attach() with tgt_prog_fd/btf_id will alloc new bpf_tracing_link
> and will add it to a link list.
>
> Just a rough idea. I wonder what Andrii thinks.
>

I need to spend more time reading existing and new code to see all the
details, but I'll throw a slightly different proposal and let you guys
shoot it down.

So, what if instead of having linked_prog (as bpf_prog *, refcnt'ed),
at BPF_PROG_LOAD time we just record the target prog's ID. BPF
verifier, when doing its target prog checks would attempt to get
bpf_prog * reference; if by that time the target program is gone,
fail, of course. If not, everything proceeds as is, at the end of
verification target_prog is put until attach time.

Then at attach time, we either go with pre-recorded (in
prog->aux->linked_prog_id) target prog's ID or we get a new one from
RAW_TP_OPEN tgt_prog_fd. Either way, we bump refcnt on that target
prog and keep it with bpf_tracing_link (so link on detach would put
target_prog, that way it doesn't go away while EXT prog is attached).
Then do all the compatibility checks, and if everything works out,
bpf_tracing_link gets created, we record trampoline there, etc, etc.
Basically, instead of having an EXT prog holding a reference to the
target prog, only attachment (bpf_link) does that, which conceptually
also seems to make more sense to me. For verification we store prog ID
and don't hold target prog at all.


Now, there will be a problem once you attach EXT prog to a new XDP
root program and release a link against the original XDP root program.
First, I hope I understand the desired sequence right, here's an
example:

1. load XDP root prog X
2. load EXT prog with target prog X
3. attach EXT prog to prog X
4. load XDP root prog Y
5. attach EXT prog to prog Y (Y and X should be "compatible")
6. detach prog X (close bpf_link)

Is that the right sequence?

If yes, then the problem with storing ID of prog X in EXT
prog->aux->linked_prog_id is that you won't be able to re-attach to
new prog Z, because there won't be anything to check compatibility
against (prog X will be long time gone).

So we can do two things here:

1. on attach, replace ext_prog->aux->linked_prog_id with the latest
attached prog (prog Y ID from above example)
2. instead of recording target program FD/ID, capture BTF FD and/or
enough BTF information for checking compatibility.

Approach 2) seems like conceptually the right thing to do (record type
info we care about, not an **instance** of BPF program, compatible
with that type info), but technically might be harder.


That's my thoughts without digging too deep, so sorry if I'm making
some stupid assumptions.



[...]




[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux