This patchset improves skmsg ingress redirection performance by
a) batching with a kworker;
b) caching skmsg allocations with a kmem cache.

As a result, our patches significantly outperform the vanilla kernel in
throughput for almost all packet sizes. The throughput improvement
ranges from 3.13% to 160.92%, with smaller packets (up to 32k) showing
the largest gains.

Latency is slightly higher than vanilla for most packet sizes, which is
expected, since it is a natural side effect of batching.

Please see the detailed benchmarks:

+------------+------------+------------+------------+------------+------------+------------+------------+------------+------------+------------+------------+
| Throughput | 64         | 128        | 256        | 512        | 1k         | 4k         | 16k        | 32k        | 64k        | 128k       | 256k       |
+------------+------------+------------+------------+------------+------------+------------+------------+------------+------------+------------+------------+
| Vanilla    | 0.17±0.02  | 0.36±0.01  | 0.72±0.02  | 1.37±0.05  | 2.60±0.12  | 8.24±0.44  | 22.38±2.02 | 25.49±1.28 | 43.07±1.36 | 66.87±4.14 | 73.70±7.15 |
| Patched    | 0.41±0.01  | 0.82±0.02  | 1.62±0.05  | 3.33±0.01  | 6.45±0.02  | 21.50±0.08 | 46.22±0.31 | 50.20±1.12 | 45.39±1.29 | 68.96±1.12 | 78.35±1.49 |
| Percentage | 141.18%    | 127.78%    | 125.00%    | 143.07%    | 148.08%    | 160.92%    | 106.52%    | 97.00%     | 5.38%      | 3.13%      | 6.32%      |
+------------+------------+------------+------------+------------+------------+------------+------------+------------+------------+------------+------------+

+------------+------------+------------+------------+------------+------------+------------+------------+------------+------------+
| Latency    | 64         | 128        | 256        | 512        | 1k         | 4k         | 16k        | 32k        | 63k        |
+------------+------------+------------+------------+------------+------------+------------+------------+------------+------------+
| Vanilla    | 5.80±4.02  | 5.83±3.61  | 5.86±4.10  | 5.91±4.19  | 5.98±4.14  | 6.61±4.47  | 8.60±2.59  | 10.96±5.50 | 15.02±6.78 |
| Patched    | 6.18±3.03  | 6.23±4.38  | 6.25±4.44  | 6.13±4.35  | 6.32±4.23  | 6.94±4.61  | 8.90±5.49  | 11.12±6.10 | 14.88±6.55 |
| Percentage | 6.55%      | 6.87%      | 6.66%      | 3.72%      | 5.68%      | 4.99%      | 3.49%      | 1.46%      | -0.93%     |
+------------+------------+------------+------------+------------+------------+------------+------------+------------+------------+

---
v2: improved commit message of patch 3/4
    changed to 'u8' for bitfields, as suggested by Jakub

Cong Wang (2):
  skmsg: rename sk_msg_alloc() to sk_msg_expand()
  skmsg: save some space in struct sk_psock

Zijian Zhang (2):
  skmsg: implement slab allocator cache for sk_msg
  tcp_bpf: improve ingress redirection performance with message corking

 include/linux/skmsg.h |  48 +++++++---
 net/core/skmsg.c      | 173 ++++++++++++++++++++++++++++++++---
 net/ipv4/tcp_bpf.c    | 204 +++++++++++++++++++++++++++++++++++++++---
 net/tls/tls_sw.c      |   6 +-
 net/xfrm/espintcp.c   |   2 +-
 5 files changed, 394 insertions(+), 39 deletions(-)

-- 
2.34.1