Hi everyone,

I noticed in iotop that all writes are executed by the same thread (io_wqe_worker-0). This is a significant problem when I am using files opened with the mentioned flags. It is not the case with reads: read requests are multiplexed over many threads (note the different name, io_wqe_worker-1).

The problem is not specific to O_SYNC. In the general case I can get higher throughput with a thread pool and regular system calls, but with O_SYNC specifically the throughput is the same as if I were using a single thread for writing.

The setup is always the same: one ring per thread with a shared worker pool (IORING_SETUP_ATTACH_WQ) and a high submission rate. It is possible to get around this performance issue by using separate worker pools, but then I have to load-balance the workload across many rings to see any performance gain. I thought it might have something to do with the IOSQE_ASYNC flag, but setting it had no effect.

Is this expected behavior? Are there any other solutions, apart from creating many rings with isolated worker pools?
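
For anyone who wants to reproduce the comparison, here is a minimal sketch of what the "thread pool and regular system calls" baseline could look like. It is not the actual benchmark: the names `write_job`, `write_worker`, `parallel_sync_write` and the 4096-byte block size are made up for illustration, and it assumes each thread writes its own disjoint range of one file opened with O_SYNC.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

#define BLOCK_SIZE 4096 /* assumed block size, not taken from the original post */

/* Hypothetical per-thread job: write `count` blocks starting at block `first`. */
struct write_job {
    int fd;
    size_t first;
    size_t count;
    int err; /* 0 on success, otherwise the errno of the failed pwrite() */
};

static void *write_worker(void *arg)
{
    struct write_job *job = arg;
    char buf[BLOCK_SIZE];

    memset(buf, 'x', sizeof(buf));
    for (size_t i = 0; i < job->count; i++) {
        off_t off = (off_t)(job->first + i) * BLOCK_SIZE;
        size_t done = 0;

        /* pwrite() may write fewer bytes than requested, so loop until done. */
        while (done < sizeof(buf)) {
            ssize_t n = pwrite(job->fd, buf + done, sizeof(buf) - done,
                               off + (off_t)done);
            if (n < 0) {
                if (errno == EINTR)
                    continue;
                job->err = errno;
                return NULL;
            }
            done += (size_t)n;
        }
    }
    return NULL;
}

/* Writes nthreads * blocks_per_thread blocks to `path` through one O_SYNC
 * file descriptor shared by all threads. Returns 0 or an errno value. */
static int parallel_sync_write(const char *path, int nthreads,
                               size_t blocks_per_thread)
{
    assert(nthreads > 0 && blocks_per_thread > 0);

    int fd = open(path, O_WRONLY | O_CREAT | O_SYNC, 0644);
    if (fd < 0)
        return errno;

    int started = 0, err = 0;
    pthread_t *tids = calloc((size_t)nthreads, sizeof(*tids));
    struct write_job *jobs = calloc((size_t)nthreads, sizeof(*jobs));

    if (!tids || !jobs) {
        err = ENOMEM;
        goto out;
    }
    for (int i = 0; i < nthreads; i++) {
        jobs[i] = (struct write_job){
            .fd = fd,
            .first = (size_t)i * blocks_per_thread,
            .count = blocks_per_thread,
        };
        err = pthread_create(&tids[i], NULL, write_worker, &jobs[i]);
        if (err)
            break;
        started++;
    }
    /* Join every thread that was started and keep the first error seen. */
    for (int i = 0; i < started; i++) {
        pthread_join(tids[i], NULL);
        if (!err && jobs[i].err)
            err = jobs[i].err;
    }
out:
    free(tids);
    free(jobs);
    close(fd);
    return err;
}
```

Calling `parallel_sync_write("/tmp/test.bin", 8, 1024)` would issue the synchronous writes from eight threads at once. Since every `pwrite()` on an O_SYNC descriptor blocks until the data is durable, this is the kind of parallelism that appears to be lost when all io_uring writes end up on a single worker. Older toolchains may need `-pthread` when compiling.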