On Mon, May 23, 2022 at 04:36:04PM -0600, Keith Busch wrote:
> On Wed, Apr 20, 2022 at 10:31:10PM +0800, Ming Lei wrote:
> > So far bio is marked as REQ_POLLED if RWF_HIPRI/IOCB_HIPRI is passed
> > from userspace sync io interface, then block layer tries to poll until
> > the bio is completed. But the current implementation calls
> > blk_io_schedule() if bio_poll() returns 0, and this way causes io hang or
> > timeout easily.
>
> Wait a second. The task's current state is TASK_RUNNING when bio_poll() returns
> zero, so calling blk_io_schedule() isn't supposed to hang.

void __sched io_schedule(void)
{
	int token;

	token = io_schedule_prepare();
	schedule();
	io_schedule_finish(token);
}

But who can wake up this task after it is scheduled out? There can't be
an irq handler for a POLLED request.

The hang can be triggered reliably on nvme/qemu:

fio --bs=4k --size=1G --ioengine=pvsync2 --norandommap --hipri=1 --iodepth=64 \
	--slat_percentiles=1 --nowait=0 --filename=/dev/nvme0n1 --direct=1 \
	--runtime=10 --numjobs=1 --rw=rw --name=test --group_reporting

Thanks,
Ming