On 10/29/23 1:11 AM, Peilin Ye wrote:
On Sat, Oct 28, 2023 at 09:06:44AM +0200, Daniel Borkmann wrote:
diff --git a/net/core/filter.c b/net/core/filter.c
index 21d75108c2e9..7aca28b7d0fd 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -2492,6 +2492,7 @@ int skb_do_redirect(struct sk_buff *skb)
 			     net_eq(net, dev_net(dev))))
 			goto out_drop;
 		skb->dev = dev;
+		dev_sw_netstats_rx_add(dev, skb->len);
This assumes that all devices that support BPF_F_PEER (currently only
veth) use tstats (instead of lstats, or dstats) - is that okay?
Dumb question, but why all this change and not simply just call ...
dev_lstats_add(dev, skb->len)
... on the host dev ?
Since I didn't want to update host-veth's TX counters. If we
bpf_redirect_peer()ed a packet from NIC TC ingress to Pod-veth TC ingress,
I think it means we've bypassed host-veth TX?
Yes. So the idea is to transition to tstats, replace the location where
we used to bump lstats with tstats' tx counter, and have only the peer
redirect bump the rx counter. Then, upon stats traversal, we fold the
latter into the rx stats, which are populated from the opposite device's
tx counters. Makes sense.
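To make the accounting above concrete, here is a rough userspace model of
it. This is only a sketch, not kernel code: struct model_dev and the
model_*() helpers are hypothetical names made up for illustration, and the
real per-CPU counters and u64_stats syncing are left out entirely.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for a device's tstats (no per-CPU handling). */
struct tstats {
	uint64_t rx_packets, rx_bytes;
	uint64_t tx_packets, tx_bytes;
};

/* Hypothetical model of one end of a veth pair. */
struct model_dev {
	struct tstats tstats;
	struct model_dev *peer;
};

static inline void model_link(struct model_dev *a, struct model_dev *b)
{
	assert(a != NULL && b != NULL);
	a->peer = b;
	b->peer = a;
}

/* Regular xmit path: only the sending device's tx counter is bumped. */
static inline void model_xmit(struct model_dev *dev, uint64_t len)
{
	dev->tstats.tx_packets++;
	dev->tstats.tx_bytes += len;
}

/*
 * Peer redirect: the packet lands on the target's ingress without going
 * through the peer's xmit, so only the target's own rx counter is bumped.
 */
static inline void model_redirect_peer(struct model_dev *target, uint64_t len)
{
	target->tstats.rx_packets++;
	target->tstats.rx_bytes += len;
}

/*
 * Stats traversal: reported rx is the peer's tx (regular traffic) with
 * the device's own rx counter (peer-redirected traffic) folded in.
 */
static inline struct tstats model_get_stats(const struct model_dev *dev)
{
	struct tstats out = dev->tstats;

	if (dev->peer) {
		out.rx_packets += dev->peer->tstats.tx_packets;
		out.rx_bytes += dev->peer->tstats.tx_bytes;
	}
	return out;
}
```

In this model, one model_xmit() of 100 bytes on the host end plus one
model_redirect_peer() of 1500 bytes to the Pod end leaves the Pod end
reporting 2 rx packets, while the host end's tx counter reflects only the
packet that was actually transmitted through it.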
OT: does cadvisor run inside the Pod to collect the device stats? Just
curious how it gathers them.
If not, should I add another NDO e.g. ->ndo_stats_rx_add()?
Definitely no new stats ndo resp. indirect call in the fast path.
Yeah, I think I'll put a comment saying that all devices that support
BPF_F_PEER must use tstats (or must use lstats), then.
sgtm.
Thanks,
Daniel