On 8/17/21 9:32 PM, Yunsheng Lin wrote:
> This patchset adds the socket to netdev page frag recycling
> support based on the busy polling and page pool infrastructure.
>
> The performance improves from 30Gbit to 41Gbit for one thread iperf
> tcp flow, and the CPU usage decreases about 20% for four threads
> iperf flow with 100Gb line speed in IOMMU strict mode.
>
> The performance improves about 2.5% for one thread iperf tcp flow
> in IOMMU passthrough mode.
>

Details about the test setup? CPU model, MTU, any other relevant changes
/ settings.

How does that performance improvement compare with using the Tx ZC API?
At 1500 MTU I see a CPU drop on the Tx side from 80% to 20% with the ZC
API and a ~10% increase in throughput. Bumping the MTU to 3300,
performance with the ZC API is 2x the current model with 1/2 the CPU.

Epyc 7502, ConnectX-6, IOMMU off.

In short, it seems like improving the Tx ZC API is a better path
forward than per-socket page pools.