On Fri, 29 May 2020 18:39:40 +0200
Toke Høiland-Jørgensen <toke@xxxxxxxxxx> wrote:

> Jesper Dangaard Brouer <brouer@xxxxxxxxxx> writes:
>
> > The devmap map-value can be read from BPF-prog side, and could be used for a
> > storage area per device. This could e.g. contain info on headers that need
> > to be added when packet egress this device.
> >
> > This patchset adds a dynamic storage member to struct bpf_devmap_val. More
> > importantly the struct bpf_devmap_val is made dynamic via leveraging and
> > requiring BTF for struct sizes above 4. The only mandatory struct member is
> > 'ifindex' with a fixed offset of zero.
> >
> > Signed-off-by: Jesper Dangaard Brouer <brouer@xxxxxxxxxx>
> > ---
> >  kernel/bpf/devmap.c |  216 ++++++++++++++++++++++++++++++++++++++++++++-------
> >  1 file changed, 185 insertions(+), 31 deletions(-)
> >
> > diff --git a/kernel/bpf/devmap.c b/kernel/bpf/devmap.c
> > index 4ab67b2d8159..9cf2dadcc0fe 100644
> [...]
> > @@ -60,13 +61,30 @@ struct xdp_dev_bulk_queue {
> >  	unsigned int count;
> >  };
> >
> > -/* DEVMAP values */
> > +/* DEVMAP map-value layout.
> > + *
> > + * The struct data-layout of map-value is a configuration interface.
> > + * BPF-prog side have read-only access to this memory.
> > + *
> > + * The layout might be different than below, because some struct members are
> > + * optional.  This is made dynamic by requiring userspace provides an BTF
> > + * description of the struct layout, when creating the BPF-map. Struct names
> > + * are important and part of API, as BTF use these names to identify members.
> > + */
> >  struct bpf_devmap_val {
> > -	__u32 ifindex;   /* device index */
> > +	__u32 ifindex;   /* device index - mandatory */
> >  	union {
> >  		int   fd;  /* prog fd on map write */
> >  		__u32 id;  /* prog id on map read */
> >  	} bpf_prog;
> > +	struct {
> > +		/* This 'storage' member is meant as a dynamically sized area,
> > +		 * that BPF developer can redefine.  As other members are added
> > +		 * overtime, this area can shrink, as size can be regained by
> > +		 * not using members above.  Add new members above this struct.
> > +		 */
> > +		unsigned char data[24];
> > +	} storage;
>
> Why is this needed? Userspace already passes in the value_size, so why
> can't the kernel just use the BTF to pick out the values it cares about
> and let the rest be up to userspace?

The kernel cannot just ignore unknown struct members, because of
forward compatibility. An older kernel that sees a new struct member
cannot know what that member is used for. Thus, later in the patch I
reject map creation if I detect members the kernel doesn't know about.
This means that I need to create a named area (e.g. named 'storage')
within which users can define their own layout.

This might be difficult to comprehend for other kernel developers,
because usually we create forward compatibility by walking the binary
struct and assuming that if an unknown area (at end-of-struct) contains
zeros, then the end-user isn't using that unknown feature. This doesn't
work when the default value, as in this exact case, needs to be minus-1
to describe "unused", as this is a file descriptor.

Forward compatibility is different here. If the end-user includes the
member in their BTF description, that means they intend to use it.
Thus, the kernel needs to reject map-create if it sees unknown members.

--
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer
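
To make the rejection rule above concrete, here is a minimal userspace
sketch of the check. It is not the code from the patch and it does not
use the kernel's BTF API: struct member_desc, known_members[] and
check_devmap_layout() are hypothetical names invented for illustration,
and the layout is modelled as a plain array of (name, offset) pairs
that userspace would have described via BTF.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for one BTF-described struct member. */
struct member_desc {
	const char *name;
	unsigned int offset;	/* byte offset within the map-value */
};

/* Member names this (hypothetical) kernel version understands. */
static const char *const known_members[] = {
	"ifindex", "bpf_prog", "storage",
};

static bool member_is_known(const char *name)
{
	size_t i;

	for (i = 0; i < sizeof(known_members) / sizeof(known_members[0]); i++)
		if (strcmp(name, known_members[i]) == 0)
			return true;
	return false;
}

/* Return 0 if the described layout is acceptable, -1 to reject map-create.
 *
 * 'ifindex' is mandatory and must sit at offset zero. Any member name
 * outside known_members[] is treated as a feature the user intends to
 * use but this kernel cannot honour, so the whole layout is rejected.
 */
static int check_devmap_layout(const struct member_desc *m, size_t n)
{
	size_t i;

	if (n == 0 || strcmp(m[0].name, "ifindex") != 0 || m[0].offset != 0)
		return -1;

	for (i = 1; i < n; i++)
		if (!member_is_known(m[i].name))
			return -1;

	return 0;
}
```

With this model, a layout of { ifindex, bpf_prog, storage } is accepted,
while { ifindex, new_feature } is rejected on an older kernel instead of
being silently ignored. User-defined data stays inside 'storage', whose
inner layout the check deliberately does not inspect.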