Re: [PATCH v4 bpf-next 3/3] veristat: guess and substitue underlying program type for freplace (EXT) progs

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Mar 30, 2023 at 7:49 AM Eduard Zingerman <eddyz87@xxxxxxxxx> wrote:
>
> On Wed, 2023-03-29 at 22:38 -0700, Alexei Starovoitov wrote:
> > On Wed, Mar 29, 2023 at 11:36 AM Eduard Zingerman <eddyz87@xxxxxxxxx> wrote:
> > >
> > > On Mon, 2023-03-27 at 11:52 -0700, Andrii Nakryiko wrote:
> > > > SEC("freplace") (i.e., BPF_PROG_TYPE_EXT) programs are not loadable as
> > > > is through veristat, as kernel expects actual program's FD during
> > > > BPF_PROG_LOAD time, which veristat has no way of knowing.
> > > >
> > > > Unfortunately, freplace programs are a pretty important class of
> > > > programs, especially when dealing with XDP chaining solutions, which
> > > > rely on EXT programs.
> > > >
> > > > So let's do our best and teach veristat to try to guess the original
> > > > program type, based on program's context argument type. And if guessing
> > > > process succeeds, we manually override freplace/EXT with guessed program
> > > > type using bpf_program__set_type() setter to increase chances of proper
> > > > BPF verification.
> > > >
> > > > We rely on BTF and maintain a simple lookup table. This process is
> > > > obviously not 100% bulletproof, as valid program might not use context
> > > > and thus wouldn't have to specify correct type. Also, __sk_buff is very
> > > > ambiguous and is the context type across many different program types.
> > > > We pick BPF_PROG_TYPE_CGROUP_SKB for now, which seems to work fine in
> > > > practice so far. Similarly, some program types require specifying attach
> > > > type, and so we pick one out of possible few variants.
> > > >
> > > > Best effort at its best. But this makes veristat even more widely
> > > > applicable.
> > > >
> > > > Signed-off-by: Andrii Nakryiko <andrii@xxxxxxxxxx>
> > >
> > > I left one nitpick below, otherwise looks good.
> > >
> > > I tried in on freplace programs from selftests and only 3 out 18
> > > programs verified correctly, others complained about unavailable
> > > functions or exit code not in range [0, 1], etc.
> > > Not sure, if it's possible to select more permissive attachment kinds, though.
> > >
> > > Tested-by: Eduard Zingerman <eddyz87@xxxxxxxxx>
> >
> > Thanks for testing and important feedback.
> > I've applied the set. The nits can be addressed in the follow up.
> >
> > What do you have in mind as 'more permissive attach' ?
> > What are those 15 out of 18 with invalid exit code?
> > What kind of attach_type will help?
>
> TLDR: I apologize, it was a bad comment.
>       Should have done more analysis yesterday and withheld from commenting.
>
> ---
>
> Inspected each program and it turned out that most of the failing ones
> are either not programs but separate functions, or have __skb_buf parameter.
> The summary table is at the end of the email, here is the list of those
> that should load but fail:
>
> - Have __skb_buf parameter and attach to SEC("tc")
>   - fexit_bpf2bpf.bpf.o:new_get_skb_len
>   - freplace_cls_redirect.bpf.o:freplace_cls_redirect_test
>   - freplace_global_func.bpf.o:new_test_pkt_access
> - Need ifindex to be specified prior loading:
>   - xdp_metadata2.bpf.o:freplace_rx

Thanks for compiling the table! I went through it and left comments.
skbuff programs do work now, but xdp_metadata2 is harder case, I don't
think it will work, as it requires driver support.

>                                                                             Should
> File                            Program                            Verdict  fail?   Reason
> ------------------------------  ---------------------------------  -------  ------  -----------------------
> fexit_bpf2bpf.bpf.o             new_get_constant                   failure      no  Function, not a program

not much that can be done here, it's just `long` argument...

> fexit_bpf2bpf.bpf.o             new_get_skb_ifindex                failure      no  Function, not a program

Yeah, this heuristic of substituting the original program type won't
work when subprog uses more than one argument and/or if the first
argument is not the context type argument, unfortunately.


> fexit_bpf2bpf.bpf.o             new_get_skb_len                    failure      no  __sk_buff parameter

Right.. I'll set SCHED_CLS as a better guess for __sk_buff. It doesn't
seem to regress anything for Meta-specific programs I tried it
on.cI'll switch it over, unless others have opinions about SCHED_CLS
vs CGROUP_SKB. I'm not well versed in all the __sk_buff-using program
type differences, unfortunately.

This one now works with SCHED_CLS now.


> fexit_bpf2bpf.bpf.o             new_test_pkt_write_access_subprog  failure      no  Function, not a program

yep, subprog, can't work

> fexit_bpf2bpf.bpf.o             test_main                          failure      no  Function, not a program
> fexit_bpf2bpf.bpf.o             test_subprog1                      failure      no  Function, not a program
> fexit_bpf2bpf.bpf.o             test_subprog2                      failure      no  Function, not a program
> fexit_bpf2bpf.bpf.o             test_subprog3                      failure      no  Function, not a program

These are actually fexit programs. Currently they are assumed to be
fexit of kernel function. These programs are using BPF_PROG() macro,
they really have a u64[] context type. Maybe there is something to be
done here to improve the situation, but I'm not yet planning to look
into this.


> freplace_attach_probe.bpf.o     new_handle_kprobe                  failure     yes

guessing kprobe correctly, yep

> freplace_cls_redirect.bpf.o     freplace_cls_redirect_test         failure      no  __sk_buff parameter

Now succeeds with SCHED_CLS.

> freplace_connect4.bpf.o         new_do_bind                        failure     yes
> freplace_connect_v4_prog.bpf.o  new_connect_v4_prog                failure     yes

yep, guessed correctly CGROUP_SOCK_ADDR, but designed to fail

> freplace_get_constant.bpf.o     security_new_get_constant          failure      no  Function, not a program

same as new_get_constant, I don't think we can do better

> freplace_global_func.bpf.o      new_test_pkt_access                failure      no  __sk_buff parameter

Now works.

> freplace_progmap.bpf.o          xdp_cpumap_prog                    success
> freplace_progmap.bpf.o          xdp_drop_prog                      success
> test_trace_ext.bpf.o            test_pkt_md_access_new             success

Cool, handled.

> xdp_metadata2.bpf.o             freplace_rx                        failure      no  needs ifindex

this is a new fancy bpf_dev_bound_kfunc_check() stuff, I don't think
it can be easily solved (I tried substituting IFINDEX_LO and that
didn't work, of course).

> ------------------------------  ---------------------------------  -------  ------- -----------------------




[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux