On 2019/06/07 4:41, Jesper Dangaard Brouer wrote:
> On Thu, 6 Jun 2019 20:04:20 +0900
> Toshiaki Makita <toshiaki.makita1@xxxxxxxxx> wrote:
>
>> On 2019/06/05 16:59, Jesper Dangaard Brouer wrote:
>>> On Wed, 5 Jun 2019 14:36:12 +0900
>>> Toshiaki Makita <toshiaki.makita1@xxxxxxxxx> wrote:
>>>
>>>> This is introduced for admins to check what is happening on XDP_TX
>>>> when bulk XDP_TX is in use, which will be first introduced in veth
>>>> in the next commit.
>>>
>>> Is the plan that this tracepoint 'xdp:xdp_bulk_tx' should be used by
>>> all drivers?
>>
>> I guess you mean all drivers that implement a similar mechanism should
>> use this? Then yes. (I don't think all drivers need a bulk tx
>> mechanism, though.)
>>
>>> (more below)
>>>
>>>> Signed-off-by: Toshiaki Makita <toshiaki.makita1@xxxxxxxxx>
>>>> ---
>>>>  include/trace/events/xdp.h | 25 +++++++++++++++++++++++++
>>>>  kernel/bpf/core.c          |  1 +
>>>>  2 files changed, 26 insertions(+)
>>>>
>>>> diff --git a/include/trace/events/xdp.h b/include/trace/events/xdp.h
>>>> index e95cb86..e06ea65 100644
>>>> --- a/include/trace/events/xdp.h
>>>> +++ b/include/trace/events/xdp.h
>>>> @@ -50,6 +50,31 @@
>>>>  		  __entry->ifindex)
>>>>  );
>>>>
>>>> +TRACE_EVENT(xdp_bulk_tx,
>>>> +
>>>> +	TP_PROTO(const struct net_device *dev,
>>>> +		 int sent, int drops, int err),
>>>> +
>>>> +	TP_ARGS(dev, sent, drops, err),
>>>> +
>>>> +	TP_STRUCT__entry(
>>>
>>> All other tracepoints in this file start with:
>>>
>>> 	__field(int, prog_id)
>>> 	__field(u32, act)
>>>
>>> or
>>>
>>> 	__field(int, map_id)
>>> 	__field(u32, act)
>>>
>>> Could you please add those?
>>
>> So... prog_id is the problem. The program can be changed while we are
>> enqueueing packets to the bulk queue, so the prog_id at flush may be
>> an unexpected one.
>
> Hmmm... that sounds problematic, if the XDP bpf_prog for veth can
> change underneath, before the flush. Our redirect system depends on
> things being stable until the xdp_do_flush_map() operation, as it will
> e.g. set the per-CPU (bpf_redirect_info) map_to_flush pointer (which
> depends on the XDP prog), and expects it to be correct/valid.
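(As an aside on the fields themselves: below is roughly what adding
prog_id/act could look like in the TRACE_EVENT. This is only a sketch,
not the posted patch, and the prog_id assignment assumes the bpf_prog
pointer visible at flush time is the one to report, which is exactly
the part in question here.)

TRACE_EVENT(xdp_bulk_tx,

	TP_PROTO(const struct net_device *dev, const struct bpf_prog *xdp,
		 int sent, int drops, int err),

	TP_ARGS(dev, xdp, sent, drops, err),

	TP_STRUCT__entry(
		__field(int, prog_id)
		__field(u32, act)
		__field(int, ifindex)
		__field(int, drops)
		__field(int, sent)
		__field(int, err)
	),

	TP_fast_assign(
		/* Assumes the prog pointer passed down at flush time is
		 * still the one that enqueued the frames. */
		__entry->prog_id	= xdp->aux->id;
		__entry->act		= XDP_TX;
		__entry->ifindex	= dev->ifindex;
		__entry->drops		= drops;
		__entry->sent		= sent;
		__entry->err		= err;
	),

	TP_printk("prog_id=%d action=%s ifindex=%d sent=%d drops=%d err=%d",
		  __entry->prog_id,
		  __print_symbolic(__entry->act, __XDP_ACT_SYM_TAB),
		  __entry->ifindex, __entry->sent, __entry->drops,
		  __entry->err)
);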
Sorry, I don't get how maps depend on programs.

At least xdp_do_redirect_map() handles a map_to_flush change during
NAPI. Is there a problem when the map is not changed but the program is
changed?

Also I believe this is not veth-specific behavior. Looking at tun and
i40e, they seem to change xdp_prog without stopping the data path.
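(Paraphrasing from memory, not verbatim kernel code, the guard I mean
is that xdp_do_redirect_map() flushes pending entries of the previous
map before recording a new one, roughly:

	if (ri->map_to_flush && ri->map_to_flush != map)
		xdp_do_flush_map();
	...
	ri->map_to_flush = map;

so a map change in the middle of a NAPI poll does not leave stale
entries behind.)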
>> It can be fixed by disabling NAPI when changing XDP programs. This
>> stops packet processing while changing XDP programs, but I guess it
>> is an acceptable compromise. Having said that, I'm honestly not so
>> eager to make this change, since it would require refurbishment of
>> one of the most delicate parts of veth XDP, the NAPI
>> disabling/enabling mechanism. WDYT?
>
> Sounds like a bug, if the XDP bpf_prog is not stable within the NAPI
> poll...

>>>> +		__field(int, ifindex)
>>>> +		__field(int, drops)
>>>> +		__field(int, sent)
>>>> +		__field(int, err)
>>>> +	),
>>>
>>> The reason is that this makes it easier to attach to multiple
>>> tracepoints and extract the same value. Example with a bpftrace
>>> one-liner:
>>>
>>> $ sudo bpftrace -e 'tracepoint:xdp:xdp_* { @action[args->act] = count(); }'
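FWIW, even with just the fields in this patch, a similar one-liner can
aggregate results per device (untested sketch, using the xdp_bulk_tx
name from this patch):

$ sudo bpftrace -e 'tracepoint:xdp:xdp_bulk_tx { @sent[args->ifindex] = sum(args->sent); @drops[args->ifindex] = sum(args->drops); }'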