We are trying to reproduce the results reported at https://github.com/frevib/io_uring-echo-server/blob/master/benchmarks/benchmarks.md in a more realistic environment:

1. Inter-node networking in an AWS cluster with i3.16xlarge nodes (25 Gigabit network connection between client and server)
2. Packet sizes of 128 and 2048 bytes, to simulate typical payloads
3. 10 clients, giving 75-95% CPU utilization on the server, to simulate the server's normal load
4. 20 clients, giving 100% CPU utilization on the server, to simulate the server's heavy load

Software:

1. OS: Ubuntu 20.04.2 LTS HWE with the 5.8.0-45-generic kernel and the latest liburing
2. io_uring-echo-server: https://github.com/frevib/io_uring-echo-server
3. epoll-echo-server: https://github.com/frevib/epoll-echo-server
4. Benchmark: https://github.com/haraldh/rust_echo_bench
5. All commands are run with "hwloc-bind os=eth1"

The results are confusing: epoll_echo_server shows a consistent advantage over io_uring-echo-server, despite the advantage reported for io_uring-echo-server.

128-byte packets, 10 clients, 75-95% CPU core utilization by server:
    echo_bench --address '172.22.117.67:7777' -c 10 -t 60 -l 128
    epoll_echo_server:    Speed: 80999 request/sec, 80999 response/sec
    io_uring_echo_server: Speed: 74488 request/sec, 74488 response/sec
    epoll_echo_server is 8% faster

128-byte packets, 20 clients, 100% CPU core utilization by server:
    echo_bench --address '172.22.117.67:7777' -c 20 -t 60 -l 128
    epoll_echo_server:    Speed: 129063 request/sec, 129063 response/sec
    io_uring_echo_server: Speed: 102681 request/sec, 102681 response/sec
    epoll_echo_server is 25% faster

2048-byte packets, 10 clients, 75-95% CPU core utilization by server:
    echo_bench --address '172.22.117.67:7777' -c 10 -t 60 -l 2048
    epoll_echo_server:    Speed: 74421 request/sec, 74421 response/sec
    io_uring_echo_server: Speed: 66510 request/sec, 66510 response/sec
    epoll_echo_server is 11% faster

2048-byte packets, 20 clients, 100% CPU core utilization by server:
    echo_bench --address '172.22.117.67:7777' -c 20 -t 60 -l 2048
    epoll_echo_server:    Speed: 108704 request/sec, 108704 response/sec
    io_uring_echo_server: Speed: 85536 request/sec, 85536 response/sec
    epoll_echo_server is 27% faster

Why does io_uring show consistently lower performance? What is going wrong?

Regards,
Michael Stoler
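
P.S. For anyone who wants to re-derive the "N% faster" figures from the raw request rates above, here is a minimal sketch. The helper name relative_advantage is hypothetical and only for illustration; the numbers are copied from the results above.

```python
def relative_advantage(faster: float, slower: float) -> float:
    """Return how much higher `faster` is than `slower`, in percent."""
    return (faster / slower - 1.0) * 100.0


# (packet size in bytes, clients) -> (epoll req/s, io_uring req/s),
# copied from the benchmark results reported above.
results = {
    (128, 10): (80999, 74488),
    (128, 20): (129063, 102681),
    (2048, 10): (74421, 66510),
    (2048, 20): (108704, 85536),
}

for (size, clients), (epoll, uring) in results.items():
    pct = relative_advantage(epoll, uring)
    print(f"{size} B, {clients} clients: epoll is {pct:.1f}% faster")
```

Truncating each result to a whole number gives the 8%, 25%, 11% and 27% quoted above.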