On Tue, 4 Jun 2019 09:28:22 +0200 Tom Barbette <barbette@xxxxxx> wrote:

> Thanks Jesper for looking into this!
>
> I don't think I will be of much help further on this matter. My
> take-away would be: as a first-time user looking into XDP after
> watching a dozen XDP talks, I would have expected XDP's default
> settings to be identical to SKB, so I don't have to watch out for a
> per-driver parameter checklist to avoid increasing my CPU consumption
> by 15% when inserting "a super efficient and light BPF program". But
> I understand it's not that easy...

The gap should not be this large, but as I demonstrated, it is
primarily because you hit an unfortunate interaction between TCP and
how the mlx5 driver does page-caching (p.s. we are working on removing
this driver-local recycle-cache). When loading an XDP/eBPF program,
the driver changes the underlying RX memory model, which trades memory
for packets-per-second speed; TCP sees this memory waste and gives us
a penalty for it.

It is important to understand that XDP is not optimized for TCP. XDP
is designed and optimized for L2-L3 handling of packets (TCP is L4).
Before XDP, these L2-L3 use-cases were "slow", because the kernel
netstack assumes an L4/socket use-case (a full SKB) even when less was
really needed.

This is actually another good example of why XDP programs per RX-queue
will be useful (notice: this is not implemented upstream, yet...).

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer