On Tue, Aug 23, 2022 at 10:13 AM Song Liu <song@xxxxxxxxxx> wrote:
>
> On Mon, Aug 22, 2022 at 8:15 PM Thomas Deutschmann <whissi@xxxxxxxxx> wrote:
> >
> > On 2022-08-23 03:37, Song Liu wrote:
> > > Thomas, have you tried to bisect with the fio repro?
> >
> > Yes, just finished:
> >
> > > d32d3d0b47f7e34560ae3c55ddfcf68694813501 is the first bad commit
> > > commit d32d3d0b47f7e34560ae3c55ddfcf68694813501
> > > Author: Christoph Hellwig
> > > Date:   Mon Jun 14 13:17:34 2021 +0200
> > >
> > >     nvme-multipath: set QUEUE_FLAG_NOWAIT
> > >
> > >     The nvme multipathing code just dispatches bios to one of the blk-mq
> > >     based paths and never blocks on its own, so set QUEUE_FLAG_NOWAIT
> > >     to support REQ_NOWAIT bios.
> >
> > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=d32d3d0b47f7e34560ae3c55ddfcf68694813501
> >
> >
> > So another NOWAIT issue -- similar to the bad commit which is causing
> > the mdraid issue I already found
> > (https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=0f9650bd838efe5c52f7e5f40c3204ad59f1964d).
> >
> > Reverting the commit, i.e. deleting
> >
> >   blk_queue_flag_set(QUEUE_FLAG_NOWAIT, head->disk->queue);
> >
> > fixes the problem for me. Well, sort of. Looks like this will disable
> > io_uring. fio reproducer fails with
>
> My system doesn't have multipath enabled. I guess bisect will point to something
> else here.
>
> I am afraid we won't get more information from bisect.

OK, I am able to pinpoint the issue, and Jens found the proper fix for
it (see below, also available in [1]). It survived 100 runs of the repro
fio job.

Thomas, please give it a try.

Thanks,
Song

diff --git c/fs/io_uring.c w/fs/io_uring.c
index 3f8a79a4affa..72a39f5ec5a5 100644
--- c/fs/io_uring.c
+++ w/fs/io_uring.c
@@ -4551,7 +4551,12 @@ static int io_write(struct io_kiocb *req, unsigned int issue_flags)
 copy_iov:
 		iov_iter_restore(&s->iter, &s->iter_state);
 		ret = io_setup_async_rw(req, iovec, s, false);
-		return ret ?: -EAGAIN;
+		if (!ret) {
+			if (kiocb->ki_flags & IOCB_WRITE)
+				kiocb_end_write(req);
+			return -EAGAIN;
+		}
+		return 0;
 	}
 out_free:
 	/* it's reportedly faster than delegating the null check to kfree() */

[1] https://lore.kernel.org/stable/a603cfc5-9ba5-20c3-3fec-2c4eec4350f7@xxxxxxxxx/T/#u