On Fri, Jan 22, 2021 at 02:38:40PM +0100, Toke Høiland-Jørgensen wrote:
> >> out:
> >> +	drops = cnt - sent;
> >> 	bq->count = 0;
> >>
> >> 	trace_xdp_devmap_xmit(bq->dev_rx, dev, sent, drops, err);
> >> 	bq->dev_rx = NULL;
> >> +	bq->xdp_prog = NULL;
> >
> > One more question, do you really have to do that per each bq_xmit_all
> > call? Couldn't you clear it in __dev_flush ?
> >
> > Or IOW - what's the rationale behind storing xdp_prog in
> > xdp_dev_bulk_queue. Why can't you propagate the dst->xdp_prog and rely on
> > that without that local pointer?
> >
> > You probably have an answer for that, so maybe include it in commit
> > message.
> >
> > BTW same question for clearing dev_rx. To me this will be the same for all
> > bq_xmit_all() calls that will happen within same napi.
>
> I think you're right: When bq_xmit_all() is called from bq_enqueue(),
> another packet will always be enqueued immediately after, so clearing
> out all of those things in bq_xmit_all() is redundant. This also
> includes the list_del on bq->flush_node, BTW.
>
> And while we're getting into the micro-optimisations: In bq_enqueue() we
> have two checks:
>
> 	if (!bq->dev_rx)
> 		bq->dev_rx = dev_rx;
>
> 	bq->q[bq->count++] = xdpf;
>
> 	if (!bq->flush_node.prev)
> 		list_add(&bq->flush_node, flush_list);
>
>
> those two if() checks can be collapsed into one, since the list and the
> dev_rx field are only ever modified together. This will also be the case
> for bq->xdp_prog, so putting all three under the same check in
> bq_enqueue() and only clearing them in __dev_flush() would be a win, I
> suppose - nice catch! :)

Thanks for the advice, so how about modifying it like this:

diff --git a/kernel/bpf/devmap.c b/kernel/bpf/devmap.c
index bc38f7193149..217e09533097 100644
--- a/kernel/bpf/devmap.c
+++ b/kernel/bpf/devmap.c
@@ -413,9 +413,6 @@ static void bq_xmit_all(struct xdp_dev_bulk_queue *bq, u32 flags)
 	bq->count = 0;
 
 	trace_xdp_devmap_xmit(bq->dev_rx, dev, sent, drops, err);
-	bq->dev_rx = NULL;
-	bq->xdp_prog = NULL;
-	__list_del_clearprev(&bq->flush_node);
 	return;
 }
 
@@ -434,8 +431,12 @@ void __dev_flush(void)
 	struct list_head *flush_list = this_cpu_ptr(&dev_flush_list);
 	struct xdp_dev_bulk_queue *bq, *tmp;
 
-	list_for_each_entry_safe(bq, tmp, flush_list, flush_node)
+	list_for_each_entry_safe(bq, tmp, flush_list, flush_node) {
 		bq_xmit_all(bq, XDP_XMIT_FLUSH);
+		bq->dev_rx = NULL;
+		bq->xdp_prog = NULL;
+		__list_del_clearprev(&bq->flush_node);
+	}
 }
 
 /* rcu_read_lock (from syscall and BPF contexts) ensures that if a delete and/or
@@ -469,22 +470,17 @@ static void bq_enqueue(struct net_device *dev, struct xdp_frame *xdpf,
 	/* Ingress dev_rx will be the same for all xdp_frame's in
 	 * bulk_queue, because bq stored per-CPU and must be flushed
 	 * from net_device drivers NAPI func end.
+	 *
+	 * Do the same with xdp_prog and flush_list since these fields
+	 * are modified together.
 	 */
-	if (!bq->dev_rx)
+	if (!bq->dev_rx) {
 		bq->dev_rx = dev_rx;
-
-	/* Store (potential) xdp_prog that run before egress to dev as
-	 * part of bulk_queue. This will be same xdp_prog for all
-	 * xdp_frame's in bulk_queue, because this per-CPU store must
-	 * be flushed from net_device drivers NAPI func end.
-	 */
-	if (!bq->xdp_prog)
 		bq->xdp_prog = xdp_prog;
+		list_add(&bq->flush_node, flush_list);
+	}
 
 	bq->q[bq->count++] = xdpf;
-
-	if (!bq->flush_node.prev)
-		list_add(&bq->flush_node, flush_list);
 }
 
 static inline int __xdp_enqueue(struct net_device *dev, struct xdp_buff *xdp,
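
For readability, this is roughly how bq_enqueue() would read with the hunk
above applied. It is an untested sketch, assuming the surrounding
kernel/bpf/devmap.c context (the per-CPU dev_flush_list, dev->xdp_bulkq,
DEV_MAP_BULK_SIZE) and the xdp_prog argument added earlier in this series:

/* Untested sketch of the resulting bq_enqueue(); relies on the existing
 * per-CPU dev_flush_list and struct xdp_dev_bulk_queue in devmap.c.
 */
static void bq_enqueue(struct net_device *dev, struct xdp_frame *xdpf,
		       struct net_device *dev_rx, struct bpf_prog *xdp_prog)
{
	struct list_head *flush_list = this_cpu_ptr(&dev_flush_list);
	struct xdp_dev_bulk_queue *bq = this_cpu_ptr(dev->xdp_bulkq);

	/* Flush a full queue before adding more frames to it. */
	if (unlikely(bq->count == DEV_MAP_BULK_SIZE))
		bq_xmit_all(bq, 0);

	/* Ingress dev_rx will be the same for all xdp_frame's in
	 * bulk_queue, because bq is stored per-CPU and must be flushed
	 * from the net_device driver's NAPI func end.
	 *
	 * dev_rx, xdp_prog and flush_node are only ever set together,
	 * so a single emptiness check covers all three.
	 */
	if (!bq->dev_rx) {
		bq->dev_rx = dev_rx;
		bq->xdp_prog = xdp_prog;
		list_add(&bq->flush_node, flush_list);
	}

	bq->q[bq->count++] = xdpf;
}

The clearing of dev_rx, xdp_prog and flush_node then happens only in
__dev_flush(), once per NAPI cycle. That should be safe because a
bq_xmit_all() called from bq_enqueue() is always followed by another
enqueue in the same cycle, so the stored pointers stay valid until the
final flush.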