On Fri, Jun 23, 2023 at 3:24 AM Jesper Dangaard Brouer <jbrouer@xxxxxxxxxx> wrote: > > > > On 22/06/2023 19.55, Stanislav Fomichev wrote: > > On Thu, Jun 22, 2023 at 2:11 AM Jesper D. Brouer <netdev@xxxxxxxxxx> wrote: > >> > >> > >> This needs to be reviewed by AF_XDP maintainers Magnus and Bjørn (Cc) > >> > >> On 21/06/2023 19.02, Stanislav Fomichev wrote: > >>> For zerocopy mode, tx_desc->addr can point to the arbitrary offset > >>> and carry some TX metadata in the headroom. For copy mode, there > >>> is no way currently to populate skb metadata. > >>> > >>> Introduce new XDP_TX_METADATA_LEN that indicates how many bytes > >>> to treat as metadata. Metadata bytes come prior to tx_desc address > >>> (same as in RX case). > >> > >> From looking at the code, this introduces a socket option for this TX > >> metadata length (tx_metadata_len). > >> This implies the same fixed TX metadata size is used for all packets. > >> Maybe describe this in patch desc. > > > > I was planning to do a proper documentation page once we settle on all > > the details (similar to the one we have for rx). > > > >> What is the plan for dealing with cases that doesn't populate same/full > >> TX metadata size ? > > > > Do we need to support that? I was assuming that the TX layout would be > > fixed between the userspace and BPF. > > I hope you don't mean fixed layout, as the whole point is adding > flexibility and extensibility. I do mean a fixed layout between the userspace (af_xdp) and devtx program. At least fixed max size of the metadata. The userspace and the bpf prog can then use this fixed space to implement some flexibility (btf_ids, versioned structs, bitmasks, tlv, etc). If we were to make the metalen vary per packet, we'd have to signal its size per packet. Probably not worth it? > > If every packet would have a different metadata length, it seems like > > a nightmare to parse? > > > > No parsing is really needed. We can simply use BTF IDs and type cast in > BPF-prog. Both BPF-prog and userspace have access to the local BTF ids, > see [1] and [2]. > > It seems we are talking slightly past each-other(?). Let me rephrase > and reframe the question, what is your *plan* for dealing with different > *types* of TX metadata. The different struct *types* will of-cause have > different sizes, but that is okay as long as they fit into the maximum > size set by this new socket option XDP_TX_METADATA_LEN. > Thus, in principle I'm fine with XSK having configured a fixed headroom > for metadata, but we need a plan for handling more than one type and > perhaps a xsk desc indicator/flag for knowing TX metadata isn't random > data ("leftover" since last time this mem was used). Yeah, I think the above correctly catches my expectation here. Some headroom is reserved via XDP_TX_METADATA_LEN and the flexibility is offloaded to the bpf program via btf_id/tlv/etc. Regarding leftover metadata: can we assume the userspace will take care of setting it up? > With this kfunc approach, then things in-principle, becomes a contract > between the "local" TX-hook BPF-prog and AF_XDP userspace. These two > components can as illustrated here [1]+[2] can coordinate based on local > BPF-prog BTF IDs. This approach works as-is today, but patchset > selftests examples don't use this and instead have a very static > approach (that people will copy-paste). > > An unsolved problem with TX-hook is that it can also get packets from > XDP_REDIRECT and even normal SKBs gets processed (right?). How does the > BPF-prog know if metadata is valid and intended to be used for e.g. > requesting the timestamp? (imagine metadata size happen to match) My assumption was the bpf program can do ifindex/netns filtering. Plus maybe check that the meta_len is the one that's expected. Will that be enough to handle XDP_REDIRECT? > --Jesper > > BPF-prog API bpf_core_type_id_local: > - [1] > https://github.com/xdp-project/bpf-examples/blob/master/AF_XDP-interaction/af_xdp_kern.c#L80 > > Userspace API btf__find_by_name_kind: > - [2] > https://github.com/xdp-project/bpf-examples/blob/master/AF_XDP-interaction/lib_xsk_extend.c#L185 >