Re: [PATCH v3 bpf-next 06/13] bpf: Add bpf_core_add_cands() and wire it into bpf_core_apply_relo_insn().

On Mon, Nov 22, 2021 at 4:43 PM Alexei Starovoitov
<alexei.starovoitov@xxxxxxxxx> wrote:
>
> On Mon, Nov 22, 2021 at 3:47 PM Andrii Nakryiko
> <andrii.nakryiko@xxxxxxxxx> wrote:
> >
> > On Fri, Nov 19, 2021 at 7:33 PM Alexei Starovoitov
> > <alexei.starovoitov@xxxxxxxxx> wrote:
> > >
> > > From: Alexei Starovoitov <ast@xxxxxxxxxx>
> > >
> > > Given the BPF program's BTF, perform a linear search through kernel BTFs
> > > for a possible candidate.
> > > Then wire the result into bpf_core_apply_relo_insn().
> > >
> > > Signed-off-by: Alexei Starovoitov <ast@xxxxxxxxxx>
> > > ---
> > >  kernel/bpf/btf.c | 136 ++++++++++++++++++++++++++++++++++++++++++++++-
> > >  1 file changed, 135 insertions(+), 1 deletion(-)
> > >
> >
> > [...]
> >
> > >  int bpf_core_apply(struct bpf_core_ctx *ctx, const struct bpf_core_relo *relo,
> > >                    int relo_idx, void *insn)
> > >  {
> > > -       return -EOPNOTSUPP;
> > > +       if (relo->kind != BPF_CORE_TYPE_ID_LOCAL) {
> > > +               struct bpf_core_cand_list *cands;
> > > +
> > > +               cands = bpf_core_find_cands(ctx, relo->type_id);
> >
> > this is wrong for many reasons:
> >
> > 1. you will overwrite the previous ctx->cands if it was already set,
> > which leaks memory
> > 2. this list of candidates should be keyed by relo->type_id ("root
> > type"). Different root types get their own independent lists, so it
> > has to be some sort of lookup table from type_id to a list of
> > candidates.
> >
> > 2) means that if you had a bunch of relos against struct task_struct,
> > you'd create a list of candidates when processing the first relo that
> > starts at task_struct. All the subsequent relos that have task_struct
> > as the root type will reuse that list and potentially trim it down. If
> > there are some other relos against, say, struct mm_struct, they will
> > have their own independent list of candidates.
>
> right.
> Your prior comment confused me. I didn't do this reuse of cands,
> to avoid introducing a hashtable here like libbpf does,
> since it does too little to actually help.

Since when is avoiding millions of iterations for each relocation "too
little help"?

BTW, I tried to measure how noticeable this will be and added
test_verif_scale2 to core_kern, with the only change being to use
vmlinux.h (so that __sk_buff accesses are CO-RE relocated). I didn't
manage to get it loaded, so something else is going on there. Please
take a look; I don't really know how to debug lskel stuff. Here are
the changes:

diff --git a/tools/testing/selftests/bpf/progs/core_kern.c b/tools/testing/selftests/bpf/progs/core_kern.c
index 3b4571d68369..9916cf059883 100644
--- a/tools/testing/selftests/bpf/progs/core_kern.c
+++ b/tools/testing/selftests/bpf/progs/core_kern.c
@@ -4,6 +4,10 @@

 #include <bpf/bpf_helpers.h>
 #include <bpf/bpf_tracing.h>
+#include <bpf/bpf_core_read.h>
+
+#define ATTR __always_inline
+#include "test_jhash.h"

 struct {
  __uint(type, BPF_MAP_TYPE_ARRAY);
@@ -57,4 +61,27 @@ int BPF_PROG(fexit_eth_type_trans, struct sk_buff *skb,
  return randmap(dev->ifindex + skb->len);
 }

+SEC("tc")
+int balancer_ingress(struct __sk_buff *ctx)
+{
+	void *data_end = (void *)(long)ctx->data_end;
+	void *data = (void *)(long)ctx->data;
+	void *ptr;
+	int ret = 0, nh_off, i = 0;
+
+	nh_off = 14;
+
+	/* pragma unroll doesn't work on large loops */
+
+#define C do { \
+	ptr = data + i; \
+	if (ptr + nh_off > data_end) \
+		break; \
+	ctx->tc_index = jhash(ptr, nh_off, ctx->cb[0] + i++); \
+	} while (0);
+#define C30 C;C;C;C;C;C;C;C;C;C;C;C;C;C;C;C;C;C;C;C;C;C;C;C;C;C;C;C;C;C;
+	C30;C30;C30; /* 90 calls */
+ return 0;
+}
+
 char LICENSE[] SEC("license") = "GPL";
diff --git a/tools/testing/selftests/bpf/progs/test_verif_scale2.c b/tools/testing/selftests/bpf/progs/test_verif_scale2.c
index f024154c7be7..9e2c2a6954cb 100644
--- a/tools/testing/selftests/bpf/progs/test_verif_scale2.c
+++ b/tools/testing/selftests/bpf/progs/test_verif_scale2.c
@@ -1,6 +1,6 @@
 // SPDX-License-Identifier: GPL-2.0
 // Copyright (c) 2019 Facebook
-#include <linux/bpf.h>
+#include "vmlinux.h"
 #include <bpf/bpf_helpers.h>
 #define ATTR __always_inline
 #include "test_jhash.h"


The version with libbpf does load successfully, while the lskel one gives:

#35 core_kern_lskel:FAIL
test_core_kern_lskel:FAIL:open_and_load unexpected error: -22

The same program loaded successfully with a libbpf skeleton.

If you manage to get it working, you'll have a BPF program with 181
CO-RE relocations; let's see how noticeable it will be to do 181
iterations over ~2mln vmlinux BTF types.
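
(Very roughly, that's 181 x ~2,000,000 ≈ 360M candidate-type checks for
a single program load if every relocation re-runs the full scan, versus
at most a handful of full scans, one per distinct root type, if the
candidate lists are cached.)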

> I think I will go back to the prior version: linear search for every relo.

I don't understand the resistance: the kernel has both rbtrees and
hashmaps, so what's the problem with just using them?
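
To make it concrete, here is a rough sketch of the kind of thing I mean
(names are made up, and the table is file-scope only to keep the sketch
short; in reality it would hang off the per-load context, e.g.
bpf_core_ctx, so nothing is shared across BPF objects):

#include <linux/hashtable.h>
#include <linux/slab.h>
#include <linux/types.h>

struct bpf_core_cand_list;	/* from this series */

/* one entry per distinct relo->type_id ("root type") */
struct cand_cache_node {
	u32 local_type_id;
	struct bpf_core_cand_list *cands;	/* result of one linear search */
	struct hlist_node hnode;
};

DEFINE_HASHTABLE(cand_cache, 6);

static struct bpf_core_cand_list *cand_cache_lookup(u32 local_type_id)
{
	struct cand_cache_node *n;

	hash_for_each_possible(cand_cache, n, hnode, local_type_id)
		if (n->local_type_id == local_type_id)
			return n->cands;
	return NULL;
}

static int cand_cache_add(u32 local_type_id, struct bpf_core_cand_list *cands)
{
	struct cand_cache_node *n = kzalloc(sizeof(*n), GFP_KERNEL);

	if (!n)
		return -ENOMEM;
	n->local_type_id = local_type_id;
	n->cands = cands;
	hash_add(cand_cache, &n->hnode, local_type_id);
	return 0;
}

Then bpf_core_apply() would check the cache first and fall back to
bpf_core_find_cands() only on a miss, so the full pass over vmlinux BTF
happens once per distinct root type instead of once per relocation, and
ctx->cands never gets silently overwritten.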

> If we actually need to optimize this part of loading,
> we'd better do a persistent cache of
> name -> kernel btf_type-s
> and reuse it across different programs.

You can't reuse it, because the same type name in a BPF object's BTF
can be resolved to different kernel types (e.g., if we still had the
ring_buffer name collision), so this is all per-BPF object (or at
least per-BPF program). We still have duplicated types in some fuller
kernel configurations, and it can always happen again down the road.
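
Put differently, whatever cache there is has to be keyed by something
that identifies the local type within a specific BTF, not by name
alone. Roughly (made-up names again):

struct cand_cache_key {
	u32 btf_obj_id;		/* which BPF object's BTF the local type lives in */
	u32 local_type_id;	/* relo->type_id within that BTF */
};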


