On 14.03.22 21:39, Jesper D. Brouer wrote:
(Cc. BPF list and other XDP maintainers)
On 14/03/2022 11.22, Felix Fietkau wrote:
Most ethernet drivers allocate a packet headroom of NET_SKB_PAD. Since it is
rounded up to L1 cache size, it ends up being at least 64 bytes on the most
common platforms.
On most ethernet drivers, having a guaranteed headroom of 256 bytes for XDP
adds an extra forced pskb_expand_head call when enabling SKB XDP, which can
be quite expensive.
Many XDP programs need only very little headroom, so it can be beneficial
to have a way to opt-out of the 256 bytes headroom requirement.
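
For illustration, the mismatch described above can be modelled in plain userspace C. This is only a sketch, not the kernel code: it assumes a platform with 64-byte L1 cache lines, and generic_xdp_extra_headroom() and align_up() are hypothetical helper names. The two constants mirror NET_SKB_PAD (the larger of 32 and the L1 cache line size) and the 256-byte XDP headroom discussed here.

```c
#include <stddef.h>

/* Assumption: a platform with 64-byte L1 cache lines. */
#define L1_CACHE_BYTES      64
/* Mirrors the kernel definition: the larger of 32 and L1_CACHE_BYTES. */
#define NET_SKB_PAD         (L1_CACHE_BYTES > 32 ? L1_CACHE_BYTES : 32)
/* The headroom generic XDP currently guarantees to programs. */
#define XDP_PACKET_HEADROOM 256

/* Hypothetical helper: round x up to a multiple of a (a is a power of two). */
size_t align_up(size_t x, size_t a)
{
    return (x + a - 1) & ~(a - 1);
}

/*
 * Hypothetical model of the headroom check: returns how many extra bytes
 * of headroom would have to be allocated (the expensive head expansion),
 * or 0 if the buffer already has enough room.
 */
size_t generic_xdp_extra_headroom(size_t skb_headroom, size_t required)
{
    if (skb_headroom >= required)
        return 0;
    return align_up(required - skb_headroom, NET_SKB_PAD);
}
```

With a driver that only provides NET_SKB_PAD (64 bytes here), a 256-byte requirement forces an expansion on every packet, while a 64-byte requirement does not.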
IMHO 64 bytes is too small.
We are using this area for struct xdp_frame and also for metadata
(XDP-hints). This will limit us from growing these structures for
the sake of generic-XDP.
I'm fine with reducing this to 192 bytes, as most Intel drivers
have this headroom, and have de facto established that this is
a valid XDP headroom, even for native-XDP.
We could go as small as two cache lines (128 bytes), because if xdp_frame
and metadata each grow beyond a cache line (64 bytes), then we have
done something wrong (performance-wise).
Here's some background on why I chose 64 bytes: I'm currently
implementing a userspace + XDP program to act as a generic fastpath to
speed up network bridging.
For that I need to support a very diverse set of network drivers
(including a lot of WLAN drivers).
My XDP program only needs up to 4 bytes extra headroom (for a VLAN header).
I made this headroom reduction opt-in, so that by default generic-XDP
programs can still rely on 256 bytes headroom.
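
To show why 4 bytes are enough for a VLAN header, here is a rough userspace sketch of pushing an 802.1Q tag into existing headroom. It is not taken from my XDP program: struct pkt and push_vlan() are hypothetical names, and a real XDP program would grow the packet at the front with bpf_xdp_adjust_head() and a negative delta instead of moving pointers by hand.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ETH_ALEN    6       /* length of one MAC address */
#define VLAN_HLEN   4       /* 802.1Q tag: 2 bytes TPID + 2 bytes TCI */
#define ETH_P_8021Q 0x8100  /* TPID value for 802.1Q */

/* Hypothetical packet view; headroom is (data - head). */
struct pkt {
    uint8_t *head;  /* start of the buffer */
    uint8_t *data;  /* start of the Ethernet frame */
    size_t len;     /* frame length in bytes */
};

/* Insert a VLAN tag after the MAC addresses; returns 0 on success, -1 on error. */
int push_vlan(struct pkt *p, uint16_t tci)
{
    uint8_t *new_data;

    if ((size_t)(p->data - p->head) < VLAN_HLEN)
        return -1;  /* not enough headroom */
    if (p->len < 2 * ETH_ALEN + 2)
        return -1;  /* shorter than an Ethernet header */

    /* Move destination and source MAC 4 bytes towards the buffer start. */
    new_data = p->data - VLAN_HLEN;
    memmove(new_data, p->data, 2 * ETH_ALEN);

    /* Write TPID and TCI in network byte order into the gap. */
    new_data[12] = ETH_P_8021Q >> 8;
    new_data[13] = ETH_P_8021Q & 0xff;
    new_data[14] = tci >> 8;
    new_data[15] = tci & 0xff;

    /* The original EtherType now follows the tag unchanged. */
    p->data = new_data;
    p->len += VLAN_HLEN;
    return 0;
}
```

Only the two MAC addresses are moved; the rest of the frame stays in place, so the cost does not depend on the packet size.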
If we make the small version any bigger than 64 bytes, it limits my
options to:
1) bump NET_SKB_PAD accordingly at the risk of creating issues with
buffer management for some affected drivers (IMHO not likely to be
accepted upstream)
2) create patches for each and every driver that could possibly get used
on OpenWrt to make my approach viable (there are so many of them that I
don't think that's really feasible either)
3) stick with non-upstream hacks for dealing with this in OpenWrt
I don't really like any of those options, but I can't think of any other
solution right now.
If I take the pskb_expand_head hit from the headroom mismatch, the
result is that my bridge accelerator code actually decreases performance
instead of making anything better.
- Felix