> On Mon Sep 30, 2024 at 1:49 PM CEST, Lorenzo Bianconi wrote:
> > > Lorenzo Bianconi <lorenzo@xxxxxxxxxx> writes:
> > > >
> > > >> > We could combine such a registration API with your header format, so
> > > >> > that the registration just becomes a way of allocating one of the keys
> > > >> > from 0-63 (and the registry just becomes a global copy of the header).
> > > >> > This would basically amount to moving the "service config file" into the
> > > >> > kernel, since that seems to be the only common denominator we can rely
> > > >> > on between BPF applications (as all attempts to write a common daemon
> > > >> > for BPF management have shown).
> > > >>
> > > >> That sounds reasonable. And I guess we'd have set() check the global
> > > >> registry to enforce that the key has been registered beforehand?
> > > >>
> > > >> > -Toke
> > > >>
> > > >> Thanks for all the feedback!
> > > >
> > > > I like this 'fast' KV approach, but I guess we should really evaluate its
> > > > impact on performance (especially for XDP) since, depending on the kfunc
> > > > call order in the eBPF program, we can have one or more memmove/memcpy
> > > > operations per packet, right?
> > >
> > > Yes, with Arthur's scheme, performance will be ordering dependent. Using
> > > a global registry for offsets would sidestep this, but have the
> > > synchronisation issues we discussed up-thread. So on balance, I think
> > > the memmove() suggestion will probably lead to the least pain.
> > >
> > > For the HW metadata we could sidestep this by always having a fixed
> > > struct for it (but using the same set/get() API with reserved keys). The
> > > only drawback of doing that is that we statically reserve a bit of
> > > space, but I'm not sure that is such a big issue in practice (at least
> > > not until this becomes so popular that the space starts to be contended;
> > > but surely 256 bytes ought to be enough for everybody, right? :)).
> >
> > I am fine with the proposed approach, but I think we need to verify its
> > impact on performance (in the worst case?).
>
> If drivers are responsible for populating the hardware metadata before
> XDP, we could make sure drivers set the fields in order to avoid any
> memmove() (and maybe even provide a helper to ensure this?).

Nope: the current APIs introduced by Stanislav consume NIC metadata in
kfuncs (mainly for AF_XDP) and, according to my understanding, we want to
add a kfunc to store the info for each NIC metadata field (e.g. rx-hash,
timestamping, ...) into the packet (this is what Toke is proposing, right?).
In this case the kfunc calling order makes a difference.
We could even add a single kfunc to store the info for all the NIC metadata
fields (maybe via a helper struct), but that does not seem scalable to me,
and we would lose kfunc versatility.
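To make the ordering issue concrete, below is a rough sketch of the
per-field store kfuncs being discussed. The bpf_xdp_metadata_*() readers
are the existing kfuncs from Stanislav's series; the xdp_store_*_meta()
names are made up purely for illustration, not an actual API:

/* Sketch only: xdp_store_*_meta() are hypothetical kfuncs, one per NIC
 * metadata field, mirroring the existing readers. */
#include "vmlinux.h"
#include <bpf/bpf_helpers.h>

/* Existing reader kfuncs: */
extern int bpf_xdp_metadata_rx_timestamp(const struct xdp_md *ctx,
					 __u64 *timestamp) __ksym;
extern int bpf_xdp_metadata_rx_hash(const struct xdp_md *ctx, __u32 *hash,
				    enum xdp_rss_hash_type *rss_type) __ksym;

/* Hypothetical store-side counterparts: */
extern int xdp_store_rx_timestamp_meta(struct xdp_md *ctx, __u64 ts) __ksym;
extern int xdp_store_rx_hash_meta(struct xdp_md *ctx, __u32 hash,
				  enum xdp_rss_hash_type type) __ksym;

SEC("xdp")
int store_hw_meta(struct xdp_md *ctx)
{
	enum xdp_rss_hash_type type;
	__u64 ts;
	__u32 hash;

	/* If the KV area keeps keys sorted, the order of these two calls
	 * decides whether the second insertion appends at the end (no
	 * move) or lands in the middle, forcing a memmove() of the
	 * entries already stored. */
	if (!bpf_xdp_metadata_rx_timestamp(ctx, &ts))
		xdp_store_rx_timestamp_meta(ctx, ts);
	if (!bpf_xdp_metadata_rx_hash(ctx, &hash, &type))
		xdp_store_rx_hash_meta(ctx, hash, type);

	return XDP_PASS;
}

char _license[] SEC("license") = "GPL";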
Regards,
Lorenzo

> > > > Moreover, I still think the metadata area in the xdp_frame/xdp_buff is
> > > > not so suitable for NIC hw metadata since:
> > > > - it grows backward
> > > > - it is probably in a different cacheline with respect to xdp_frame
> > > > - NIC hw metadata will not start at a fixed and immutable address, but
> > > >   depends on the running eBPF program
> > > >
> > > > What about having something like:
> > > > - fixed hw NIC metadata: just after the xdp_frame struct (or, if you
> > > >   want, at the end of the metadata area :)). Here we can reuse the
> > > >   same KV approach if it is fast
> > > > - user-defined metadata: in the metadata area of the xdp_frame/xdp_buff
> > >
> > > AFAIU, none of this will live in the (current) XDP metadata area. It
> > > will all live just after the xdp_frame struct (so sharing the space with
> > > the metadata area, in the sense that adding more metadata KV fields will
> > > decrease the amount of space that is usable by the current XDP metadata
> > > APIs).
> > >
> > > -Toke
> >
> > Ah, ok. I was thinking the proposed approach was to put them in the
> > current metadata field.
>
> I've also been thinking of putting this new KV stuff at the start of the
> headroom (I think that's what you're saying, Toke?). It has a few nice
> advantages:
>
> * It coexists nicely with the current XDP / TC metadata support.
>   Those users won't be able to overwrite / corrupt the KV metadata, and
>   KV users won't need to call xdp_adjust_meta() (which would be awkward -
>   how would they know how much space the KV implementation needs?).
>
> * We don't have to move all the metadata every time we call
>   xdp_adjust_head() (or the kernel equivalent).
>
> Are there any performance implications of that, e.g. for caching?
>
> This would also grow "upwards", which is more natural, but I think
> either way the KV API would hide whether it grows downwards or upwards
> from users.
>
> > Regards,
> > Lorenzo
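To illustrate the headroom placement being discussed (a sketch of my
understanding only, not the actual kernel layout): the KV area would sit
right after the xdp_frame struct and grow towards the packet, while the
existing metadata area keeps growing down from data:

  +----------------------+ <- start of headroom
  | struct xdp_frame     |
  +----------------------+
  | KV metadata          |   grows "upwards" (towards the packet data)
  | [hdr][key0][key3]... |
  +----------------------+
  |    free headroom     |
  +----------------------+ <- data_meta
  | XDP / TC metadata    |   grows "downwards" (xdp_adjust_meta())
  +----------------------+ <- data
  | packet data          |
  +----------------------+ <- data_end

Since the two areas grow towards each other the space is shared, and
xdp_adjust_head() never has to touch the KV entries.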
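Finally, to make the registration idea from the top of the thread
concrete, here is a minimal sketch of what an in-kernel key registry
could look like. Everything below (names, sizes, locking) is
hypothetical:

/*
 * Minimal sketch: 64 keys total, the lowest ones reserved for HW
 * metadata, the rest handed out on registration. set() would check the
 * bitmap so that only registered keys can be stored.
 */
#include <linux/bitmap.h>
#include <linux/errno.h>
#include <linux/init.h>
#include <linux/spinlock.h>
#include <linux/types.h>

#define KV_MAX_KEYS	64
#define KV_HW_KEYS	8	/* reserved: rx-hash, rx-timestamp, ... */

static DEFINE_SPINLOCK(kv_lock);
static DECLARE_BITMAP(kv_allocated, KV_MAX_KEYS);
static u8 kv_size[KV_MAX_KEYS];		/* value size for each key */

static int __init kv_registry_init(void)
{
	/* HW metadata keys are always registered. */
	bitmap_set(kv_allocated, 0, KV_HW_KEYS);
	return 0;
}
core_initcall(kv_registry_init);

/* Hand out the first free key: this is the "service config file"
 * moved into the kernel. */
static int kv_register_key(u8 size)
{
	int key;

	spin_lock(&kv_lock);
	key = find_next_zero_bit(kv_allocated, KV_MAX_KEYS, KV_HW_KEYS);
	if (key >= KV_MAX_KEYS) {
		spin_unlock(&kv_lock);
		return -ENOSPC;
	}
	__set_bit(key, kv_allocated);
	kv_size[key] = size;
	spin_unlock(&kv_lock);

	return key;
}

/* The check set() would perform before storing a value. */
static bool kv_key_registered(int key)
{
	return key >= 0 && key < KV_MAX_KEYS && test_bit(key, kv_allocated);
}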