eBPF sockmap relies extensively on skmsg for its datapath. skmsg is much simpler than the traditional skbuff, but it is also less sophisticated, and as a result the skmsg datapath lacks numerous optimizations. For example, the TCP_BPF ingress redirection path currently lacks the message corking mechanism that is used extensively in the standard TCP/IP transmission path. The sender therefore wakes up the receiver for every message, even when the messages are small, which leads to lower throughput than regular TCP in certain scenarios.

We propose an optimization that introduces a kernel-worker-based intermediate layer to provide automatic message corking for TCP_BPF. Although this incurs a minor latency overhead, it significantly improves overall throughput by reducing unnecessary wake-ups and socket lock contention. Our results indicate that, compared with vanilla TCP_BPF, throughput improves by 5% to 160% depending on the message size, at a negligible latency cost of approximately 2% on average (still much better than loopback TCP). In the future, we aim to explore the possibility of eliminating the socket lock from this code path entirely.

Another performance bottleneck is the data copying caused by the conversion between skbuff and skmsg on the socket transmission path. Essentially, the ->sendmsg() callback is not suitable for transmitting an skbuff directly. We propose a new socket callback that can enqueue an skbuff directly, which could eliminate this unnecessary data copying.

Note: Zijian's work is available for review on GitHub at https://github.com/Sm0ckingBird/linux/commits/tcp_bpf/ ; we are waiting for bpf-next to merge with bpf before submitting it. Thanks!
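
To illustrate why corking reduces receiver wake-ups, here is a minimal userspace toy model in Python. It is not the kernel implementation: the names `Receiver`, `CorkingQueue`, `cork_bytes`, and `run` are hypothetical, and the size-threshold flush policy is an assumption made for illustration. In the proposed design the batching is performed by a kernel worker, whose actual flush conditions are not described here.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Receiver:
    """Toy receiver that counts how often it is woken up."""
    wakeups: int = 0
    received: List[bytes] = field(default_factory=list)

    def wake(self, batch: List[bytes]) -> None:
        # One wake-up delivers the whole batch at once.
        self.wakeups += 1
        self.received.extend(batch)


class CorkingQueue:
    """Hypothetical intermediate layer that holds messages until
    at least cork_bytes have accumulated (assumed flush policy)."""

    def __init__(self, receiver: Receiver, cork_bytes: int) -> None:
        self.receiver = receiver
        self.cork_bytes = cork_bytes
        self.pending: List[bytes] = []
        self.pending_bytes = 0

    def send(self, msg: bytes) -> None:
        self.pending.append(msg)
        self.pending_bytes += len(msg)
        if self.pending_bytes >= self.cork_bytes:
            self.flush()

    def flush(self) -> None:
        # Deliver everything queued so far with a single wake-up.
        if self.pending:
            self.receiver.wake(self.pending)
            self.pending = []
            self.pending_bytes = 0


def run(cork_bytes: int, n_msgs: int = 100, msg_size: int = 64) -> Receiver:
    """Send n_msgs messages of msg_size bytes through the queue."""
    rx = Receiver()
    queue = CorkingQueue(rx, cork_bytes)
    for _ in range(n_msgs):
        queue.send(b"x" * msg_size)
    queue.flush()  # stands in for a final flush of leftover data
    return rx


if __name__ == "__main__":
    print("no corking:  ", run(cork_bytes=1).wakeups, "wake-ups")
    print("1 KiB corking:", run(cork_bytes=1024).wakeups, "wake-ups")
```

With 100 messages of 64 bytes each, the model wakes the receiver 100 times when every message is flushed immediately (`cork_bytes=1`) and 7 times with a 1 KiB threshold, while delivering the same 100 messages in both cases. The trade-off mirrors the one described above: fewer wake-ups in exchange for a message possibly waiting in the queue before delivery.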