Thank you for doing this. The fix will help, but it is not a complete solution. +1 to Jesse's comment about putting into a descriptive helper. This functionality might be useful elsewhere. I'd like to reiterate that the problem is broader. For instance, if I hack the IOCTL path to be immune from the hung task watchdog from firing there, the problem simply moves to another I/O that is behind the IOCTL. For instance if I have a dd command doing direct I/O against the same device that has the IOCTL pending, the dd will hang in uninterruptible sleep and eventually the hung task timer will fire. [ 5645.712955] dd D 634a749ef8dd8750 0 5126 4978 0x00000000 last_sleep: 5568344242277. last_runnable: 5568342587878 [ 5645.712965] ffff88005cfb2c00 ffff88007ab14940 ffff88007a6df080 ffff8800614a4b00 [ 5645.712973] 0000000000014940 ffff880076f77a00 ffffffffb1076a57 ffff880000000000 [ 5645.712981] ffff8800614a4b00 7fffffffffffffff ffff880075402940 ffff880076f77a38 [ 5645.712988] Call Trace: [ 5645.713000] [<ffffffffb1076a57>] ? __schedule+0x37a/0x7a2 [ 5645.713005] [<ffffffffb10766a7>] schedule+0x3f/0x75 [ 5645.713009] [<ffffffffb1079410>] schedule_timeout+0x53/0x6af [ 5645.713015] [<ffffffffb1077425>] io_schedule_timeout+0x6b/0x91 [ 5645.713021] [<ffffffffb09dd66a>] dio_await_one+0x94/0xdb [ 5645.713025] [<ffffffffb09dc668>] __blockdev_direct_IO+0x6d5/0x763 [ 5645.713029] [<ffffffffb09dbde9>] ? blkdev_direct_IO+0x3b/0x3b [ 5645.713034] [<ffffffffb09dbde9>] ? blkdev_direct_IO+0x3b/0x3b [ 5645.713039] [<ffffffffb09dbde3>] blkdev_direct_IO+0x35/0x3b [ 5645.713044] [<ffffffffb095bf30>] generic_file_direct_write+0xfb/0x16d [ 5645.713050] [<ffffffffb098c661>] __generic_file_write_iter+0x2b0/0x3c1 [ 5645.713056] [<ffffffffb0bffe9d>] ? write_port+0xdc/0xdc [ 5645.713061] [<ffffffffb0b506de>] ? __clear_user+0x41/0x66 [ 5645.713065] [<ffffffffb09db78f>] blkdev_write_iter+0xd1/0x13e [ 5645.713072] [<ffffffffb0a66977>] SyS_write+0x22d/0x431 [ 5645.713077] [<ffffffffb107b163>] entry_SYSCALL_64_fastpath+0x31/0xab Similarly, I am sure there are filesystem specific paths that can potentially hang. I suspect that the flushing code in ext4 might be one such path. These are just examples. Also, my worry persists that we are creating more places for bugs to hide by blanket immunizing I/O paths from the hung task timer. Having said all of that, I don't really have a significantly better in-kernel solution. Reviewed-by: Salman Qazi <sqazi@xxxxxxxxxx> On Thu, Feb 20, 2020 at 6:43 AM Jesse Barnes <jsbarnes@xxxxxxxxxx> wrote: > > Yeah looks good. > > Reviewed-by: Jesse Barnes <jsbarnes@xxxxxxxxxx> > > A further change (just for readability) might be to factor out these > "don't trigger hangcheck" waits from here and blk_execute_rq() into a > small helper with a descriptive name. > > Thanks, > Jesse > > On Thu, Feb 20, 2020 at 5:05 AM Ming Lei <ming.lei@xxxxxxxxxx> wrote: > > > > submit_bio_wait() can be called from ioctl(BLKSECDISCARD), which > > may take long time to complete, as Salman mentioned, 4K BLKSECDISCARD > > takes up to 100 second on some devices. Also any block I/O operation > > that occurs after the BLKSECDISCARD is submitted will also potentially > > be affected by the hung task timeouts. > > > > So prevent hung_check from firing by taking same approach used > > in blk_execute_rq(), and the wake-up interval is set as half the > > hung_check timer period, which keeps overhead low enough. > > > > Cc: Salman Qazi <sqazi@xxxxxxxxxx> > > Cc: Jesse Barnes <jsbarnes@xxxxxxxxxx> > > Cc: Bart Van Assche <bvanassche@xxxxxxx> > > Link: https://lkml.org/lkml/2020/2/12/1193 > > Reported-by: Salman Qazi <sqazi@xxxxxxxxxx> > > Signed-off-by: Ming Lei <ming.lei@xxxxxxxxxx> > > --- > > block/bio.c | 10 +++++++++- > > 1 file changed, 9 insertions(+), 1 deletion(-) > > > > diff --git a/block/bio.c b/block/bio.c > > index 94d697217887..c9ce19a86de7 100644 > > --- a/block/bio.c > > +++ b/block/bio.c > > @@ -17,6 +17,7 @@ > > #include <linux/cgroup.h> > > #include <linux/blk-cgroup.h> > > #include <linux/highmem.h> > > +#include <linux/sched/sysctl.h> > > > > #include <trace/events/block.h> > > #include "blk.h" > > @@ -1019,12 +1020,19 @@ static void submit_bio_wait_endio(struct bio *bio) > > int submit_bio_wait(struct bio *bio) > > { > > DECLARE_COMPLETION_ONSTACK_MAP(done, bio->bi_disk->lockdep_map); > > + unsigned long hang_check; > > > > bio->bi_private = &done; > > bio->bi_end_io = submit_bio_wait_endio; > > bio->bi_opf |= REQ_SYNC; > > submit_bio(bio); > > - wait_for_completion_io(&done); > > + > > + /* Prevent hang_check timer from firing at us during very long I/O */ > > + hang_check = sysctl_hung_task_timeout_secs; > > + if (hang_check) > > + while (!wait_for_completion_io_timeout(&done, hang_check * (HZ/2))); > > + else > > + wait_for_completion_io(&done); > > > > return blk_status_to_errno(bio->bi_status); > > } > > -- > > 2.20.1 > >