On 1/26/23 11:04 AM, Saeed Mirzamohammadi wrote:
> Hi Jens,
>
>> On Jan 25, 2023, at 4:28 PM, Jens Axboe <axboe@xxxxxxxxx> wrote:
>>
>> On 1/25/23 5:22 PM, Saeed Mirzamohammadi wrote:
>>> Hi Jens,
>>>
>>> I applied your patch (with a minor conflict in xfs_file_open() since
>>> FMODE_BUF_WASYNC isn't in v5.15) and did the same series of tests on
>>> the v5.15 kernel. All the io_uring benchmarks regressed 20-45% after
>>> it. I haven't tested on v6.1 yet.
>>
>> It should basically make the behavior the same as before once you apply
>> the patch, so please pass on the patch that you applied for 5.15 so we
>> can take a closer look.
>
> Attached the patch.

I tested the upstream variant, and it does what it's supposed to and gets
parallel writes on O_DIRECT. Unpatched, any dio write results in:

fio-566 [000] ..... 131.071108: io_uring_queue_async_work: ring 00000000706cb6c0, request 00000000b21691c4, user_data 0xaaab0e8e4c00, opcode WRITE, flags 0xe0040000, hashed queue, work 000000002c5aeb79

and after the patch:

fio-376 [000] ..... 24.590994: io_uring_queue_async_work: ring 000000007bdb650a, request 000000006b5350e0, user_data 0xaaab1b3e3c00, opcode WRITE, flags 0xe0040000, normal queue, work 00000000e3e81955

where the hashed queue is serialized based on the inode, and the normal
queue is not (i.e. its work items run in parallel).

As mentioned, the fio job being used isn't representative of anything
that should actually be run; the async flag really only exists for
experimentation. Do you have a real workload that is seeing a
regression? If yes, does that real workload change performance with the
patch?

-- 
Jens Axboe
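
For anyone wanting to check which queue their own writes land on, the two trace lines above differ only in the "hashed queue" / "normal queue" field. A minimal sketch of tallying that field from captured trace output follows. It assumes the io_uring_queue_async_work event text matches the two samples quoted in this mail; the function names are hypothetical helpers, not part of any tool.

```python
import re

# Assumption: the event text looks like the two samples quoted in this mail,
# i.e. "... io_uring_queue_async_work: ..., hashed queue, ..." or
# "... io_uring_queue_async_work: ..., normal queue, ...".
QUEUE_RE = re.compile(r"io_uring_queue_async_work:.*\b(hashed|normal) queue\b")


def queue_type(trace_line):
    """Return 'hashed', 'normal', or None if the line is not a matching event."""
    match = QUEUE_RE.search(trace_line)
    return match.group(1) if match else None


def count_queue_types(lines):
    """Tally how many async work items were queued hashed vs. normal."""
    counts = {"hashed": 0, "normal": 0}
    for line in lines:
        kind = queue_type(line)
        if kind is not None:
            counts[kind] += 1
    return counts
```

A run where every dio write is counted as "hashed" would match the unpatched behavior described above (serialized per inode), and one where they are counted as "normal" would match the patched behavior.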