Re: [PATCH v2 bpf-next 1/2] xdp: Add tracepoint for bulk XDP_TX

On 2019/06/05 16:59, Jesper Dangaard Brouer wrote:
On Wed,  5 Jun 2019 14:36:12 +0900
Toshiaki Makita <toshiaki.makita1@xxxxxxxxx> wrote:

This is introduced for admins to check what is happening on XDP_TX when
bulk XDP_TX is in use, which will be first introduced in veth in the next
commit.

Is the plan that this tracepoint 'xdp:xdp_bulk_tx' should be used by
all drivers?

I guess you mean that all drivers implementing a similar mechanism should use this? Then yes.
(I don't think all drivers need a bulk TX mechanism, though.)

(more below)

Signed-off-by: Toshiaki Makita <toshiaki.makita1@xxxxxxxxx>
---
  include/trace/events/xdp.h | 25 +++++++++++++++++++++++++
  kernel/bpf/core.c          |  1 +
  2 files changed, 26 insertions(+)

diff --git a/include/trace/events/xdp.h b/include/trace/events/xdp.h
index e95cb86..e06ea65 100644
--- a/include/trace/events/xdp.h
+++ b/include/trace/events/xdp.h
@@ -50,6 +50,31 @@
  		  __entry->ifindex)
  );
+TRACE_EVENT(xdp_bulk_tx,
+
+	TP_PROTO(const struct net_device *dev,
+		 int sent, int drops, int err),
+
+	TP_ARGS(dev, sent, drops, err),
+
+	TP_STRUCT__entry(

All other tracepoints in this file start with:

		__field(int, prog_id)
		__field(u32, act)
or
		__field(int, map_id)
		__field(u32, act)

Could you please add those?

So... prog_id is the problem. The program can be changed while we are enqueueing packets to the bulk queue, so the prog_id at flush may be an unexpected one.

It can be fixed by disabling NAPI when changing XDP programs. This stops packet processing while the program is being changed, but I guess that is an acceptable compromise. Having said that, I'm honestly not so eager to make this change, since it will require reworking one of the most delicate parts of veth XDP, the NAPI disabling/enabling mechanism.
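
To make the race concrete, here is a minimal userspace model, not kernel code. Every name in it (model_dev, model_enqueue, model_flush and so on) is made up for illustration, and draining the queue before the swap is only an assumed stand-in for "disable NAPI, change the program, enable NAPI".

```c
#include <assert.h>

#define BULK_SIZE 16

/* Userspace model only: none of these names exist in the kernel. */
struct bulk_queue {
	int count;		/* frames waiting to be flushed */
};

struct model_dev {
	int prog_id;		/* id of the currently attached XDP program */
	struct bulk_queue bq;
};

struct flush_report {
	int prog_id;		/* what a tracepoint would record at flush time */
	int sent;
};

/* Queue one XDP_TX frame; the program that produced it is not remembered. */
static void model_enqueue(struct model_dev *dev)
{
	assert(dev->bq.count < BULK_SIZE);
	dev->bq.count++;
}

/* Flush the queue and report whichever program is attached right now. */
static struct flush_report model_flush(struct model_dev *dev)
{
	struct flush_report r = {
		.prog_id = dev->prog_id,
		.sent = dev->bq.count,
	};

	dev->bq.count = 0;
	return r;
}

/* Racy swap: replaces the program while frames are still queued. */
static void model_swap_racy(struct model_dev *dev, int new_id)
{
	dev->prog_id = new_id;
}

/* Quiesced swap: drain the queue first, then replace the program. */
static struct flush_report model_swap_quiesced(struct model_dev *dev, int new_id)
{
	struct flush_report r = model_flush(dev);

	dev->prog_id = new_id;
	return r;
}
```

In this model, frames queued under program 1 followed by model_swap_racy(dev, 2) are reported by model_flush() with prog_id 2, while model_swap_quiesced() reports them with prog_id 1 before the new program takes over.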

WDYT?

+		__field(int, ifindex)
+		__field(int, drops)
+		__field(int, sent)
+		__field(int, err)
+	),

The reason is that this makes it easier to attach to multiple
tracepoints and extract the same value.

Example with bpftrace oneliner:

$ sudo bpftrace -e 'tracepoint:xdp:xdp_* { @action[args->act] = count(); }'
Attaching 8 probes...
^C

@action[4]: 30259246
@action[0]: 34489024

XDP_ABORTED  = 0
XDP_REDIRECT = 4
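
For decoding the remaining @action keys, a small userspace sketch may help. The enum values mirror enum xdp_action from include/uapi/linux/bpf.h; the helper xdp_action_name() is hypothetical and exists only for this example.

```c
#include <assert.h>

/* Values mirror enum xdp_action in include/uapi/linux/bpf.h. */
enum xdp_action {
	XDP_ABORTED = 0,
	XDP_DROP,
	XDP_PASS,
	XDP_TX,
	XDP_REDIRECT,
};

/* Hypothetical helper: map a tracepoint "act" value to its name. */
static const char *xdp_action_name(unsigned int act)
{
	static const char *const names[] = {
		[XDP_ABORTED]	= "XDP_ABORTED",
		[XDP_DROP]	= "XDP_DROP",
		[XDP_PASS]	= "XDP_PASS",
		[XDP_TX]	= "XDP_TX",
		[XDP_REDIRECT]	= "XDP_REDIRECT",
	};
	const unsigned int n = sizeof(names) / sizeof(names[0]);

	/* The table must cover every action up to XDP_REDIRECT. */
	assert(n == (unsigned int)XDP_REDIRECT + 1);

	return act < n ? names[act] : "unknown";
}
```

With this mapping, a tracepoint that sets act to XDP_TX would be counted under key 3 in the @action map.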


+
+	TP_fast_assign(

		__entry->act		= XDP_TX;

OK


+		__entry->ifindex	= dev->ifindex;
+		__entry->drops		= drops;
+		__entry->sent		= sent;
+		__entry->err		= err;
+	),
+
+	TP_printk("ifindex=%d sent=%d drops=%d err=%d",
+		  __entry->ifindex, __entry->sent, __entry->drops, __entry->err)
+);
+

Other fun bpftrace stuff:

$ sudo bpftrace -e 'tracepoint:xdp:xdp_*map* { @map_id[comm, args->map_id] = count(); }'
Attaching 5 probes...
^C

@map_id[swapper/2, 113]: 1428
@map_id[swapper/0, 113]: 2085
@map_id[ksoftirqd/4, 113]: 2253491
@map_id[ksoftirqd/2, 113]: 25677560
@map_id[ksoftirqd/0, 113]: 29004338
@map_id[ksoftirqd/3, 113]: 31034885


$ bpftool map list id 113
113: devmap  name tx_port  flags 0x0
	key 4B  value 4B  max_entries 100  memlock 4096B


p.s. People should look out for Brendan Gregg's upcoming book on BPF
performance tools, from which I learned to use bpftrace :-)

Where can I get information on the book?

--
Toshiaki Makita


