On Thu, Mar 30, 2023 at 7:49 AM Eduard Zingerman <eddyz87@xxxxxxxxx> wrote: > > On Wed, 2023-03-29 at 22:38 -0700, Alexei Starovoitov wrote: > > On Wed, Mar 29, 2023 at 11:36 AM Eduard Zingerman <eddyz87@xxxxxxxxx> wrote: > > > > > > On Mon, 2023-03-27 at 11:52 -0700, Andrii Nakryiko wrote: > > > > SEC("freplace") (i.e., BPF_PROG_TYPE_EXT) programs are not loadable as > > > > is through veristat, as kernel expects actual program's FD during > > > > BPF_PROG_LOAD time, which veristat has no way of knowing. > > > > > > > > Unfortunately, freplace programs are a pretty important class of > > > > programs, especially when dealing with XDP chaining solutions, which > > > > rely on EXT programs. > > > > > > > > So let's do our best and teach veristat to try to guess the original > > > > program type, based on program's context argument type. And if guessing > > > > process succeeds, we manually override freplace/EXT with guessed program > > > > type using bpf_program__set_type() setter to increase chances of proper > > > > BPF verification. > > > > > > > > We rely on BTF and maintain a simple lookup table. This process is > > > > obviously not 100% bulletproof, as valid program might not use context > > > > and thus wouldn't have to specify correct type. Also, __sk_buff is very > > > > ambiguous and is the context type across many different program types. > > > > We pick BPF_PROG_TYPE_CGROUP_SKB for now, which seems to work fine in > > > > practice so far. Similarly, some program types require specifying attach > > > > type, and so we pick one out of possible few variants. > > > > > > > > Best effort at its best. But this makes veristat even more widely > > > > applicable. > > > > > > > > Signed-off-by: Andrii Nakryiko <andrii@xxxxxxxxxx> > > > > > > I left one nitpick below, otherwise looks good. > > > > > > I tried in on freplace programs from selftests and only 3 out 18 > > > programs verified correctly, others complained about unavailable > > > functions or exit code not in range [0, 1], etc. > > > Not sure, if it's possible to select more permissive attachment kinds, though. > > > > > > Tested-by: Eduard Zingerman <eddyz87@xxxxxxxxx> > > > > Thanks for testing and important feedback. > > I've applied the set. The nits can be addressed in the follow up. > > > > What do you have in mind as 'more permissive attach' ? > > What are those 15 out of 18 with invalid exit code? > > What kind of attach_type will help? > > TLDR: I apologize, it was a bad comment. > Should have done more analysis yesterday and withheld from commenting. > > --- > > Inspected each program and it turned out that most of the failing ones > are either not programs but separate functions, or have __skb_buf parameter. > The summary table is at the end of the email, here is the list of those > that should load but fail: > > - Have __skb_buf parameter and attach to SEC("tc") > - fexit_bpf2bpf.bpf.o:new_get_skb_len > - freplace_cls_redirect.bpf.o:freplace_cls_redirect_test > - freplace_global_func.bpf.o:new_test_pkt_access > - Need ifindex to be specified prior loading: > - xdp_metadata2.bpf.o:freplace_rx Thanks for compiling the table! I went through it and left comments. skbuff programs do work now, but xdp_metadata2 is harder case, I don't think it will work, as it requires driver support. > Should > File Program Verdict fail? Reason > ------------------------------ --------------------------------- ------- ------ ----------------------- > fexit_bpf2bpf.bpf.o new_get_constant failure no Function, not a program not much that can be done here, it's just `long` argument... > fexit_bpf2bpf.bpf.o new_get_skb_ifindex failure no Function, not a program Yeah, this heuristic of substituting the original program type won't work when subprog uses more than one argument and/or if the first argument is not the context type argument, unfortunately. > fexit_bpf2bpf.bpf.o new_get_skb_len failure no __sk_buff parameter Right.. I'll set SCHED_CLS as a better guess for __sk_buff. It doesn't seem to regress anything for Meta-specific programs I tried it on.cI'll switch it over, unless others have opinions about SCHED_CLS vs CGROUP_SKB. I'm not well versed in all the __sk_buff-using program type differences, unfortunately. This one now works with SCHED_CLS now. > fexit_bpf2bpf.bpf.o new_test_pkt_write_access_subprog failure no Function, not a program yep, subprog, can't work > fexit_bpf2bpf.bpf.o test_main failure no Function, not a program > fexit_bpf2bpf.bpf.o test_subprog1 failure no Function, not a program > fexit_bpf2bpf.bpf.o test_subprog2 failure no Function, not a program > fexit_bpf2bpf.bpf.o test_subprog3 failure no Function, not a program These are actually fexit programs. Currently they are assumed to be fexit of kernel function. These programs are using BPF_PROG() macro, they really have a u64[] context type. Maybe there is something to be done here to improve the situation, but I'm not yet planning to look into this. > freplace_attach_probe.bpf.o new_handle_kprobe failure yes guessing kprobe correctly, yep > freplace_cls_redirect.bpf.o freplace_cls_redirect_test failure no __sk_buff parameter Now succeeds with SCHED_CLS. > freplace_connect4.bpf.o new_do_bind failure yes > freplace_connect_v4_prog.bpf.o new_connect_v4_prog failure yes yep, guessed correctly CGROUP_SOCK_ADDR, but designed to fail > freplace_get_constant.bpf.o security_new_get_constant failure no Function, not a program same as new_get_constant, I don't think we can do better > freplace_global_func.bpf.o new_test_pkt_access failure no __sk_buff parameter Now works. > freplace_progmap.bpf.o xdp_cpumap_prog success > freplace_progmap.bpf.o xdp_drop_prog success > test_trace_ext.bpf.o test_pkt_md_access_new success Cool, handled. > xdp_metadata2.bpf.o freplace_rx failure no needs ifindex this is a new fancy bpf_dev_bound_kfunc_check() stuff, I don't think it can be easily solved (I tried substituting IFINDEX_LO and that didn't work, of course). > ------------------------------ --------------------------------- ------- ------- -----------------------