Re: [RFC bpf-next] bpf: Create shadow variables for struct_ops in skeletons.

Andrii Nakryiko <andrii.nakryiko@xxxxxxxxx> · Fri, 26 Jan 2024 09:28:08 -0800

On Thu, Jan 25, 2024 at 6:46 PM Kui-Feng Lee <sinquersw@xxxxxxxxx> wrote:
>
>
>
> On 1/25/24 15:59, Andrii Nakryiko wrote:
> > On Thu, Jan 25, 2024 at 3:13 PM Kui-Feng Lee <sinquersw@xxxxxxxxx> wrote:
> >>
> >>
> >>
> >> On 1/25/24 14:21, Andrii Nakryiko wrote:
> >>> On Wed, Jan 24, 2024 at 2:41 PM <thinker.li@xxxxxxxxx> wrote:
> >>>>
> >>>> From: Kui-Feng Lee <thinker.li@xxxxxxxxx>
> >>>>
> >>>> Create shadow variables for the fields of struct_ops maps in a skeleton
> >>>> with the same name as the respective fields. For instance, if struct
> >>>> bpf_testmod_ops has a "data" field as following, you can access or modify
> >>>> its value through a shadow variable also named "data" in the skeleton.
> >>>>
> >>>>     SEC(".struct_ops.link")
> >>>>     struct bpf_testmod_ops testmod_1 = {
> >>>>         .data = 0x1,
> >>>>     };
> >>>>
> >>>> Then, you can change the value in the following manner as shown in the code
> >>>> below.
> >>>>
> >>>>     skel->st_ops_vars.testmod_1.data = 13;
> >>>>
> >>>> It is helpful to configure a struct_ops map during the execution of a
> >>>> program. For instance, you can include a flag in a struct_ops type to
> >>>> modify the way the kernel handles an instance. By implementing this
> >>>> feature, a user space program can alter the flag's value prior to loading
> >>>> an instance into the kernel.
> >>>>
> >>>> The code generator for skeletons will produce code that copies values to
> >>>> shadow variables from the internal data buffer when a skeleton is
> >>>> opened. It will also copy values from shadow variables back to the internal
> >>>> data buffer before a skeleton is loaded into the kernel.
> >>>>
> >>>> The code generator will calculate the offset of a field and generate code
> >>>> that copies the value of the field from or to the shadow variable to or
> >>>> from the struct_ops internal data, with an offset relative to the beginning
> >>>> of the internal data. For instance, if the "data" field in struct
> >>>> bpf_testmod_ops is situated 16 bytes from the beginning of the struct, the
> >>>> address for the "data" field of struct bpf_testmod_ops will be the starting
> >>>> address plus 16.
> >>>>
> >>>> The offset is calculated during code generation, so it is always in line
> >>>> with the internal data in the skeleton. Even if the user space programs and
> >>>> the BPF program were not compiled together, any differences in versions
> >>>> would not impact correctness. This means that even if the BPF program and
> >>>> the user space program reference different versions of the "struct
> >>>> bpf_testmod_ops" and have different offsets for "data", these offsets
> >>>> computed by the code generator would still function correctly.
> >>>>
> >>>> The user space programs can only modify values by using these shadow
> >>>> variables when the skeleton is open, but before being loaded. Once the
> >>>> skeleton is loaded, the value is copied to the kernel, and any future
> >>>> changes only affect the shadow variables in the user space memory and do
> >>>> not update the kernel space. The shadow variables are not initialized
> >>>> before opening a skeleton, so you cannot update values through them before
> >>>> opening.
> >>>>
> >>>> For function pointers (operators), you can change/replace their values with
> >>>> other BPF programs. For example, the test case in test_struct_ops_module.c
> >>>> points .test_2 to test_3() while it was pointed to test_2() by assigning a
> >>>> new value to the shadow variable.
> >>>>
> >>>>     skel->st_ops_vars.testmod_1.test_2 = skel->progs.test_3;
> >>>>
> >>>> However, you need to turn off autoload for test_2() since it is not
> >>>> assigned to any struct_ops map anymore. Or, it will fails to load test_2().
> >>>>
> >>>>    bpf_program__set_autoload(skel->progs.test_2, false);
> >>>>
> >>>> You can define more struct_ops programs than necessary. However, you need
> >>>> to turn autoload off for unused ones.
> >>>
> >>> Overall I like the idea, it seems like a pretty natural and powerful
> >>> interface. Few things I'd do a bit differently:
> >>>
> >>> - less code gen in the skeleton. It feels like it's better to teach
> >>> libbpf to allow getting/setting intial struct_ops map "image" using
> >>> standard bpf_map__initial_value() and bpf_map__set_initial_value().
> >>> That fits with other global data maps.
> >>
> >> Right, it is doable to move some logic from the code generator to
> >> libbpf. The only thing should keep in the code generator should be
> >> generating shadow variable fields in the skeleton.
> >>
> >>>
> >>> - I think all struct ops maps should be in skel->struct_ops.<name>,
> >>> not struct_ops_vars. I'd probably also combine struct_ops.link and
> >>> no-link struct ops in one section for simplicity
> >>
> >> agree!
> >>
> >>>
> >>> - getting underlying struct_ops BTF type should be possible to do
> >>> through bpf_map__btf_value_type_id(), no need for struct_ops-specific
> >>> one
> >>
> >> AFAIK, libbpf doesn't set def.value_type_id for struct_ops maps.
> >> bpf_map__btf_value_type_id() doesn't return the ID of a struct_ops type.
> >> I will check what the side effects are if def.value_type_id is set for
> >> struct_ops maps.
> >
> > Yes, it doesn't right now, not sure why, though. At least we can fix
> > that on libbpf sid
> >
> >>
> >>>
> >>> - you have a bunch of specific logic to dump INT/ENUM/PTR, I wonder if
> >>> we can just reuse libbpf's BTF dumping API to dump everything? Though
> >>> for prog pointers we'd want to rewrite them to `struct bpf_program *`
> >>> field, not sure yet how hard it is.
> >>
> >> I am not quite sure what part you are talking about.
> >> Do you mean the code skipping type modifiers?
> >> The implementation skips all const (and static/volatile/... etc) to make
> >> sure the user space programs can change these values without any
> >> tricky type casting.
> >>
> >
> > No, I'm talking about gen_st_ops_shadow_vars and gen_st_ops_func_one,
> > which don't handle members of type STRUCT/UNION, for example. I didn't
> > look too deeply into details of the implementation in those parts, but
> > I don't see any reason why we shouldn't support embedded struct
> > members there?
>
>
> One of goals here is to make sure the result doesn't affect by the
> change of types between versions. So, even a BPF program and a user
> space program are compiled separated with different versions of a type,
> they should still work together if the skeleton doesn't miss any
> field required by the user space program.

I don't see a problem here. There is some local type information that
was used to compile BPF object file. Libbpf knows it, it's in BTF.
This is the type user knows about and fills out. Then during BPF
object load libbpf will translate it to whatever kernel actually
expects. I think it all works out fine and not really a concern. We
just stick (consistently) to what was compiled into .bpf.o's BTF
information.

>
> For fields of struct or
> union types, that means we also have to include definitions of all
> these types in the header files of skeletons.  It may also raise
> type conflicts if we don't rename these types properly.

That's true, and is basically similar to the problem we have with
global variables and their types. The only difference is that these
struct_ops types are generally not controlled by users, and rather
come from kernel (vmlinux.h and such), so you are right, this might
cause some problems. Ok, we can start simple and skip those for now, I
guess.

>
> >
> >>>
> >>> The above is brief, so please ask questions where it's not clear what
> >>> I propose, thanks!
> >>>
> >>> [...]
> >>
> >> Thank you for the comments.
> >>