Sync polling also needs REQ_NOWAIT flag. One sync read/write may be split into several bios (and thus several requests), and can used up the queue depth sometimes. Thus the following bio in the same sync read/write will wait for usable request if REQ_NOWAIT flag not set, in which case the following sync polling will cause a deadlock. One case (maybe the only case) for above situation is preadv2/pwritev2 + direct + highpri. Two conditions need to be satisfied to trigger the deadlock. 1. HIPRI IO in sync routine. Normal read(2)/pread(2)/readv(2)/preadv(2) and corresponding write family syscalls don't support high-priority IO and thus won't trigger polling routine. Only preadv2(2)/pwritev2(2) supports high-priority IO by RWF_HIPRI flag of @flags parameter. 2. Polling support in sync routine. Currently both the blkdev and iomap-based fs (ext4/xfs, etc) support polling in direct IO routine. The general routine is described as follows. submit_bio wait for blk_mq_get_tag(), waiting for requests completion, which should be done by the following polling, thus causing a deadlock. blk_poll poll in current process context This issue can be reproduced by following steps: 1. set max_sectors_kb to '4', and set nr_requests to '4'. This deliberate setting is imposed to stress the issue. 2. fio test with the following configuration: ioengine=pvsync2 iodepth=128 numjobs=1 thread rw=randread direct=1 hipri=1 bs=4M filename=/dev/nvme0n1 In this case one sync read will be split into 1024 bios and will exhaust the queue depth rapidly, causing the following bio waiting in the following stack forever. [<0>] blk_mq_get_tag+0x19d/0x290 [<0>] __blk_mq_alloc_request+0x5c/0x130 [<0>] blk_mq_submit_bio+0x103/0x5d0 [<0>] submit_bio_noacct+0x309/0x400 [<0>] submit_bio+0x81/0x190 [<0>] blkdev_direct_IO+0x44c/0x4a0 [<0>] generic_file_read_iter+0x80/0x160 [<0>] do_iter_readv_writev+0x186/0x1c0 [<0>] do_iter_read+0xc6/0x1c0 [<0>] vfs_readv+0x7e/0xc0 [<0>] do_preadv+0xad/0xc0 [<0>] do_syscall_64+0x2d/0x40 [<0>] entry_SYSCALL_64_after_hwframe+0x44/0xa9 Fix: 675d9e95538e5 ("block: add bio_set_polled() helper") Signed-off-by: Jeffle Xu <jefflexu@xxxxxxxxxxxxxxxxx> --- include/linux/bio.h | 8 +++----- 1 file changed, 3 insertions(+), 5 deletions(-) diff --git a/include/linux/bio.h b/include/linux/bio.h index c6d765382926..884ee5aa3df4 100644 --- a/include/linux/bio.h +++ b/include/linux/bio.h @@ -800,17 +800,15 @@ static inline int bio_integrity_add_page(struct bio *bio, struct page *page, #endif /* CONFIG_BLK_DEV_INTEGRITY */ /* - * Mark a bio as polled. Note that for async polled IO, the caller must - * expect -EWOULDBLOCK if we cannot allocate a request (or other resources). + * Mark a bio as polled. Note that for polled IO, the caller must expect + * -EWOULDBLOCK if we cannot allocate a request (or other resources). * We cannot block waiting for requests on polled IO, as those completions * must be found by the caller. This is different than IRQ driven IO, where * it's safe to wait for IO to complete. */ static inline void bio_set_polled(struct bio *bio, struct kiocb *kiocb) { - bio->bi_opf |= REQ_HIPRI; - if (!is_sync_kiocb(kiocb)) - bio->bi_opf |= REQ_NOWAIT; + bio->bi_opf |= REQ_HIPRI | REQ_NOWAIT; } #endif /* __LINUX_BIO_H */ -- 2.27.0