Re: io_uring networking performance degradation

On 4/19/21 10:13 AM, Michael Stoler wrote:
> We are trying to reproduce the results reported at
> https://github.com/frevib/io_uring-echo-server/blob/master/benchmarks/benchmarks.md
> in a more realistic environment:
> 1. Internode networking in an AWS cluster with i3.16xlarge nodes (25
> Gigabit network connection between client and server)
> 2. 128- and 2048-byte packet sizes, to simulate typical payloads
> 3. 10 clients to get 75-95% CPU utilization by server to simulate
> server's normal load
> 4. 20 clients to get 100% CPU utilization by server to simulate
> server's hard load
> 
> Software:
> 1. OS: Ubuntu 20.04.2 LTS HWE with 5.8.0-45-generic kernel with latest liburing
> 2. io_uring-echo-server: https://github.com/frevib/io_uring-echo-server
> 3. epoll-echo-server: https://github.com/frevib/epoll-echo-server
> 4. benchmark: https://github.com/haraldh/rust_echo_bench
> 5. all commands are run with "hwloc-bind os=eth1"
> 
> The results are confusing: epoll_echo_server shows a stable advantage
> over io_uring-echo-server, despite the reported advantage of
> io_uring-echo-server:
> 
> 128 bytes packet size, 10 clients, 75-95% CPU core utilization by server:
> echo_bench --address '172.22.117.67:7777' -c 10 -t 60 -l 128
> epoll_echo_server:      Speed: 80999 request/sec, 80999 response/sec
> io_uring_echo_server:   Speed: 74488 request/sec, 74488 response/sec
> epoll_echo_server is 8% faster
> 
> 128 bytes packet size, 20 clients, 100% CPU core utilization by server:
> echo_bench --address '172.22.117.67:7777' -c 20 -t 60 -l 128
> epoll_echo_server:      Speed: 129063 request/sec, 129063 response/sec
> io_uring_echo_server:    Speed: 102681 request/sec, 102681 response/sec
> epoll_echo_server is 25% faster
> 
> 2048 bytes packet size, 10 clients, 75-95% CPU core utilization by server:
> echo_bench --address '172.22.117.67:7777' -c 10 -t 60 -l 2048
> epoll_echo_server:       Speed: 74421 request/sec, 74421 response/sec
> io_uring_echo_server:    Speed: 66510 request/sec, 66510 response/sec
> epoll_echo_server is 11% faster
> 
> 2048 bytes packet size, 20 clients, 100% CPU core utilization by server:
> echo_bench --address '172.22.117.67:7777' -c 20 -t 60 -l 2048
> epoll_echo_server:       Speed: 108704 request/sec, 108704 response/sec
> io_uring_echo_server:    Speed: 85536 request/sec, 85536 response/sec
> epoll_echo_server is 27% faster
> 
> Why does io_uring show consistent performance degradation? What is going wrong?
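
For reference, the percentages quoted above can be recomputed from the raw
request rates. This is only a small sketch and not part of the original
benchmark; the function name advantage_percent and the RESULTS table layout
are made up for illustration, and the numbers are copied from the runs above.

```python
# Raw request rates (requests/sec) copied from the benchmark runs quoted above.
RESULTS = [
    # (packet size in bytes, clients, epoll rate, io_uring rate)
    (128, 10, 80999, 74488),
    (128, 20, 129063, 102681),
    (2048, 10, 74421, 66510),
    (2048, 20, 108704, 85536),
]


def advantage_percent(faster, slower):
    """Return how much higher `faster` is than `slower`, in percent."""
    return (faster / slower - 1.0) * 100.0


for size, clients, epoll, uring in RESULTS:
    pct = advantage_percent(epoll, uring)
    print(f"{size:>4} bytes, {clients:>2} clients: epoll is {pct:.1f}% faster")
```

Truncating each result to a whole number gives the 8%, 25%, 11% and 27%
figures from the report, so the gap clearly widens once the server is
saturated (20 clients).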

5.8 is pretty old, and I'm not sure all the performance problems were
addressed there. Apart from missing the common optimisations you may
have seen in the thread, it looks to me like it doesn't have the
sighand(?) lock hammering fix. Jens knows better whether it has been
backported or not.

So, things you can do:
1) try out 5.12
2) attach `perf top` output or some other profiling for your 5.8
3) for a more complete picture, do 2) on 5.12 as well
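
If `perf top` output is awkward to paste, one way to get an attachable
profile for step 2) is to record the server process for a fixed time and
dump a text report. A rough sketch, assuming `perf` is installed and you
know the server's PID; the helper names, the default file names and the
30 second window are placeholders, not anything prescribed in this thread.

```python
import subprocess


def perf_record_cmd(pid, seconds, data_file="perf.data"):
    """Build a `perf record` command that samples call graphs of one process.

    `sleep` only serves as a timer: perf samples the given PID until it exits.
    """
    return ["perf", "record", "-g", "-p", str(pid), "-o", data_file,
            "--", "sleep", str(seconds)]


def perf_report_cmd(data_file="perf.data"):
    """Build a `perf report` command that prints a plain-text report."""
    return ["perf", "report", "--stdio", "-i", data_file]


def capture_profile(pid, seconds=30, data_file="perf.data",
                    report_file="perf-report.txt"):
    """Record the process for `seconds`, then write a text report to a file."""
    subprocess.run(perf_record_cmd(pid, seconds, data_file), check=True)
    with open(report_file, "w") as out:
        subprocess.run(perf_report_cmd(data_file), stdout=out, check=True)
    return report_file
```

Calling capture_profile(server_pid) while the benchmark is running, once
on 5.8 and once on 5.12, would give two reports that can be compared side
by side. It will likely need root or a relaxed perf_event_paranoid setting.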

Let's find out what's gone wrong.

-- 
Pavel Begunkov


