On 05/03/2017 11:08 AM, Bart Van Assche wrote:
> On Thu, 2017-05-04 at 00:52 +0800, Ming Lei wrote:
>> Looks v4.11 plus your for-linus often triggers the following hang during
>> boot, and it seems caused by the change in (blk-mq: unify hctx delayed_run_work
>> and run_work)
>>
>> BUG: scheduling while atomic: kworker/0:1H/704/0x00000002
>> Modules linked in:
>> Preemption disabled at:
>> [<ffffffffaf5607bb>] virtio_queue_rq+0xdb/0x350
>> CPU: 0 PID: 704 Comm: kworker/0:1H Not tainted 4.11.0-04508-ga1f35f46164b #132
>> Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.9.3-1.fc25 04/01/2014
>> Workqueue: kblockd blk_mq_run_work_fn
>> Call Trace:
>>  dump_stack+0x65/0x8f
>>  ? virtio_queue_rq+0xdb/0x350
>>  __schedule_bug+0x76/0xc0
>>  __schedule+0x610/0x820
>>  ? new_slab+0x2c9/0x590
>>  schedule+0x40/0x90
>>  schedule_timeout+0x273/0x320
>>  ? ___slab_alloc+0x3cb/0x4f0
>>  wait_for_completion+0x97/0x100
>>  ? wait_for_completion+0x97/0x100
>>  ? wake_up_q+0x80/0x80
>>  flush_work+0x104/0x1a0
>>  ? flush_workqueue_prep_pwqs+0x130/0x130
>>  __cancel_work_timer+0xeb/0x160
>>  ? vp_notify+0x16/0x20
>>  ? virtqueue_add_sgs+0x23c/0x4a0
>>  cancel_delayed_work_sync+0x13/0x20
>>  blk_mq_stop_hw_queue+0x16/0x20
>>  virtio_queue_rq+0x316/0x350
>>  blk_mq_dispatch_rq_list+0x194/0x350
>>  blk_mq_sched_dispatch_requests+0x118/0x170
>>  ? finish_task_switch+0x80/0x1e0
>>  __blk_mq_run_hw_queue+0xa3/0xc0
>>  blk_mq_run_work_fn+0x2c/0x30
>>  process_one_work+0x1e0/0x400
>>  worker_thread+0x48/0x3f0
>>  kthread+0x109/0x140
>>  ? process_one_work+0x400/0x400
>>  ? kthread_create_on_node+0x40/0x40
>>  ret_from_fork+0x2c/0x40
>
> Callers of blk_mq_quiesce_queue() really need blk_mq_stop_hw_queue() to
> cancel delayed work synchronously.

Right.

> The above call stack shows that we have to do something about the
> blk_mq_stop_hw_queue() calls from inside .queue_rq() functions for
> queues for which BLK_MQ_F_BLOCKING has not been set.
> I'm not sure what
> the best approach would be: setting BLK_MQ_F_BLOCKING for queues that
> call blk_mq_stop_hw_queue() from inside .queue_rq() or creating two
> versions of blk_mq_stop_hw_queue().

Regardless of whether BLOCKING is set or not, we don't have to hard
guarantee the flush from the drivers. If they do happen to get a 2nd
invocation before being stopped, that doesn't matter.

So I think we're fine with the patch I sent out 5 minutes ago; it would
be great if Ming could test it, though.

-- 
Jens Axboe