On 1/29/25 10:39 AM, Max Kellermann wrote:
> On Wed, Jan 29, 2025 at 6:19 PM Jens Axboe <axboe@xxxxxxxxx> wrote:
>> The other patches look pretty straightforward to me. Only thing that
>> has me puzzled a bit is why you have so much io-wq activity with your
>> application, in general I'd expect 0 activity there. But then I saw the
>> forced ASYNC flag, and it makes sense. In general, forcing that isn't a
>> great idea, but for a benchmark for io-wq it certainly makes sense.
>
> I was experimenting with io_uring and wanted to see how much
> performance I can squeeze out of my web server running
> single-threaded. The overhead of io_uring_submit() grew very large,
> because the "send" operation would do a lot of synchronous work in the
> kernel. I tried SQPOLL but it was actually a big performance
> regression; this just shifted my CPU usage to epoll_wait(). Forcing
> ASYNC gave me large throughput improvements (moving the submission
> overhead to iowq), but then the iowq lock contention was the next
> limit, thus this patch series.
>
> I'm still experimenting, and I will certainly revisit SQPOLL to learn
> more about why it didn't help and how to fix it.

Why are you combining it with epoll in the first place? It's a lot more
efficient to wait on one or multiple events in io_uring_enter() rather
than go back to a serialized one-event-per-notification model by using
epoll to wait on completions on the io_uring side.

-- 
Jens Axboe