Re: [PATCH bpf-next v2 1/4] selftests/bpf: Add traffic monitor functions.

Martin KaFai Lau <martin.lau@xxxxxxxxx> · Wed, 24 Jul 2024 12:08:18 -0700

On 7/23/24 11:24 AM, Kui-Feng Lee wrote:
Add functions that capture packets and print log in the background. They
are supposed to be used for debugging flaky network test cases. A monitored
test case should call traffic_monitor_start() to start a thread to capture
packets in the background for a given namespace and call
traffic_monitor_stop() to stop capturing.

     IPv4 TCP packet: 127.0.0.1:48165 -> 127.0.0.1:36707, len 68, ifindex 1, SYN
     IPv4 TCP packet: 127.0.0.1:36707 -> 127.0.0.1:48165, len 60, ifindex 1, SYN, ACK
     IPv4 TCP packet: 127.0.0.1:48165 -> 127.0.0.1:36707, len 60, ifindex 1, ACK
     IPv4 TCP packet: 127.0.0.1:36707 -> 127.0.0.1:48165, len 52, ifindex 1, ACK
     IPv4 TCP packet: 127.0.0.1:48165 -> 127.0.0.1:36707, len 52, ifindex 1, FIN, ACK
     IPv4 TCP packet: 127.0.0.1:36707 -> 127.0.0.1:48165, len 52, ifindex 1, RST, ACK
     Packet file: packets-2172-86.log
     #280/87 select_reuseport/sockhash IPv4/TCP LOOPBACK test_detach_bpf:OK

The above is the output of an example. It shows the packets of a connection
and the name of the file that contains captured packets in the directory
/tmp/tmon_pcap. The file can be loaded by tcpdump or wireshark.

This feature only works if TRAFFIC_MONITOR variable has been passed to
build BPF selftests. For example,

   make TRAFFIC_MONITOR=1 -C tools/testing/selftests/bpf

If bpf CI does not have libpcap, it is better to get bpf CI ready first/soon.

[ ... ]

+/* Show the information of the transport layer in the packet */
+static void show_transport(const u_char *packet, u16 len, u32 ifindex,
+			   const char *src_addr, const char *dst_addr,
+			   u16 proto, bool ipv6)
+{
+	struct udphdr *udp;
+	struct tcphdr *tcp;
+	u16 src_port, dst_port;
+	const char *transport_str;
+
+	if (proto == IPPROTO_UDP) {
+		udp = (struct udphdr *)packet;
+		src_port = ntohs(udp->source);
+		dst_port = ntohs(udp->dest);
+		transport_str = "UDP";
+	} else if (proto == IPPROTO_TCP) {
+		tcp = (struct tcphdr *)packet;
+		src_port = ntohs(tcp->source);
+		dst_port = ntohs(tcp->dest);
+		transport_str = "TCP"
+;
+	} else {

It will be useful to at least print the ICMP[46] also. Some tests use ping to 
test. For IPv6, printing the ICMPv6 messages will be useful for debugging, e.g. 
the neigh discovery. The icmp type (and code?) should be good enough.

That should be enough to begin with. The pcap dumped file can be used for the rest.

Thanks for switching to libpcap. It is easier to handle the captured packets in 
different ways.

+		printf("%s (proto %d): %s -> %s, ifindex %d\n",
+		       ipv6 ? "IPv6" : "IPv4", proto, src_addr, dst_addr, ifindex);
+		return;
+	}
+
+	if (ipv6)
+		printf("IPv6 %s packet: [%s]:%d -> [%s]:%d, len %d, ifindex %d",

It will be useful to print the ifname also such that easier for human parsing. 
It should be possible by if_indextoname (cheap enough?) if libpcap doesn't have 
it. It could be something for a later followup though. Mostly nit here.

+		       transport_str, src_addr, src_port,
+		       dst_addr, dst_port, len, ifindex);
+	else
+		printf("IPv4 %s packet: %s:%d -> %s:%d, len %d, ifindex %d",
+		       transport_str, src_addr, src_port,
+		       dst_addr, dst_port, len, ifindex);
+
+	if (proto == IPPROTO_TCP) {
+		if (tcp->fin)
+			printf(", FIN");
+		if (tcp->syn)
+			printf(", SYN");
+		if (tcp->rst)
+			printf(", RST");
+		if (tcp->ack)
+			printf(", ACK");
+	}
+
+	printf("\n");
+}

[ ... ]

+/* Start to monitor the network traffic in the given network namespace.
+ *
+ * netns: the name of the network namespace to monitor. If NULL, the
+ * current network namespace is monitored.
+ *
+ * This function will start a thread to capture packets going through NICs
+ * in the give network namespace.
+ */
+struct tmonitor_ctx *traffic_monitor_start(const char *netns)

There is opportunity to make the traffic monitoring easier for tests that create 
its own netns which I hope most of the networking tests fall into this bucket 
now. Especially for tests that create multiple netns such that the test does not 
have to start/stop for each individual netns.

May be adding an API like "struct nstoken *netns_new(const char *netns_name)". 
The netns_new() will create the netns and (optionally) start the monitoring 
thread also. It will need another "void netns_free(struct nstoken *nstoken)" to 
stop the thread and remove the netns. The "struct tmonitor_ctx" probably makes 
sense to be embedded into "struct nstoken" if we go with this new API.

This will need some changes to the tests creating netns but it probably should 
be obvious change considering most test do "ip netns add..." and then 
open_netns(). It can start with the flaky test at hand first like tc_redirect.

May be a little more changes for the test using "unshare(CLONE_NEWNET)" but 
should not be too bad either. This can be done only when we need to turn on 
libpcap to debug that test.

Also, when the test is flaky, make it easier for people not familiar with the 
codes of the networking test to turn on traffic monitoring without changing the 
test code. May be in a libpcap.list file (in parallel to the existing DENYLIST)?

For the tests without having its own netns, they can either move to netns (which 
I think it is a good thing to do) or use the traffic_monitor_start/stop() 
manually by changing the testing code,
or a better way is to ask test_progs do it for the host netns (init_netns) 
automatically for all tests in the libpcap.list.

wdyt?

+{
+	struct tmonitor_ctx *ctx = NULL;
+	struct nstoken *nstoken = NULL;
+	int pipefd[2] = {-1, -1};
+	static int tmon_seq;
+	int r;
+
+	if (netns) {
+		nstoken = open_netns(netns);
+		if (!nstoken)
+			return NULL;
+	}
+	ctx = malloc(sizeof(*ctx));
+	if (!ctx) {
+		log_err("Failed to malloc ctx");
+		goto fail_ctx;
+	}
+	memset(ctx, 0, sizeof(*ctx));
+
+	snprintf(ctx->pkt_fname, sizeof(ctx->pkt_fname),
+		 PCAP_DIR "/packets-%d-%d.log", getpid(), tmon_seq++);

nit. I wonder if it is useful to also have the netns name in the filename?

Not sure if it is more useful to have the test_num and subtest_num instead of 
pid. Probably doable from looking at test__start_subtest().