On Wed, Jan 29, 2025 at 6:19 PM Jens Axboe <axboe@xxxxxxxxx> wrote:
> The other patches look pretty straight forward to me. Only thing that
> has me puzzled a bit is why you have so much io-wq activity with your
> application, in general I'd expect 0 activity there. But Then I saw the
> forced ASYNC flag, and it makes sense. In general, forcing that isn't a
> great idea, but for a benchmark for io-wq it certainly makes sense.

I was experimenting with io_uring and wanted to see how much
performance I could squeeze out of my web server running
single-threaded. The overhead of io_uring_submit() grew very large,
because the "send" operation would do a lot of synchronous work in the
kernel. I tried SQPOLL, but it was actually a big performance
regression; it just shifted my CPU usage to epoll_wait(). Forcing ASYNC
gave me large throughput improvements (by moving the submission
overhead to io-wq), but then io-wq lock contention became the next
limit, hence this patch series.

I'm still experimenting, and I will certainly revisit SQPOLL to learn
more about why it didn't help and how to fix it.

> I'll apply 1-7 once 6.14-rc1 is out and I can kick off a
> for-6.15/io_uring branch. Thanks!

Thanks Jens, and please let me know when you're ready to discuss the
last patch. It's a big improvement for those who combine io_uring with
epoll; it's worth it.

Max