Currently we observed a significant performance degradation in samples/bpf xdp1 and xdp2, due XDP multibuffer "xdp.frags" handling, added in commit 772251742262 ("samples/bpf: fixup some tools to be able to support xdp multibuffer"). This patch reduce the overhead by avoiding to read/load shared_info (sinfo) memory area, when XDP packet don't have any frags. This improves performance because sinfo is located in another cacheline. Function bpf_xdp_pointer() is used by BPF helpers bpf_xdp_load_bytes() and bpf_xdp_store_bytes(). As a help to reviewers, xdp_get_buff_len() can potentially access sinfo, but it uses xdp_buff_has_frags() flags bit check to avoid accessing sinfo in no-frags case. The likely/unlikely instrumentation lays out asm code such that sinfo access isn't interleaved with no-frags case (checked on GCC 12.2.1-4). The generated asm code is more compact towards the no-frags case. The BPF kfunc bpf_dynptr_slice() also use bpf_xdp_pointer(). Thus, it should also take effect for that. Signed-off-by: Jesper Dangaard Brouer <brouer@xxxxxxxxxx> --- net/core/filter.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/net/core/filter.c b/net/core/filter.c index 968139f4a1ac..961db5bd2f94 100644 --- a/net/core/filter.c +++ b/net/core/filter.c @@ -3948,20 +3948,21 @@ void bpf_xdp_copy_buf(struct xdp_buff *xdp, unsigned long off, void *bpf_xdp_pointer(struct xdp_buff *xdp, u32 offset, u32 len) { - struct skb_shared_info *sinfo = xdp_get_shared_info_from_buff(xdp); u32 size = xdp->data_end - xdp->data; + struct skb_shared_info *sinfo; void *addr = xdp->data; int i; if (unlikely(offset > 0xffff || len > 0xffff)) return ERR_PTR(-EFAULT); - if (offset + len > xdp_get_buff_len(xdp)) + if (unlikely(offset + len > xdp_get_buff_len(xdp))) return ERR_PTR(-EINVAL); - if (offset < size) /* linear area */ + if (likely((offset < size))) /* linear area */ goto out; + sinfo = xdp_get_shared_info_from_buff(xdp); offset -= size; for (i = 0; i < sinfo->nr_frags; i++) { /* paged area */ u32 frag_size = skb_frag_size(&sinfo->frags[i]);