Re: [PATCH] block: don't call blk_mq_run_hw_queues() for dead or dying queues

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 3/26/19 4:12 PM, Bart Van Assche wrote:
On Tue, 2019-03-26 at 15:08 +0100, Hannes Reinecke wrote:
On 3/26/19 2:43 PM, Bart Van Assche wrote:
On 3/26/19 5:07 AM, Hannes Reinecke wrote:
When a queue is dying or dead there is no point in calling
blk_mq_run_hw_queues() in blk_mq_unquiesce_queue(); in fact, doing
so might crash the machine as the queue structures are in the
process of being deleted.

Signed-off-by: Hannes Reinecke <hare@xxxxxxxx>
---
   block/blk-mq.c | 3 ++-
   1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index a9c181603cbd..b1eeba38bc79 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -258,7 +258,8 @@ void blk_mq_unquiesce_queue(struct request_queue *q)
       blk_queue_flag_clear(QUEUE_FLAG_QUIESCED, q);
       /* dispatch requests which are inserted during quiescing */
-    blk_mq_run_hw_queues(q, true);
+    if (!blk_queue_dying(q) && !blk_queue_dead(q))
+        blk_mq_run_hw_queues(q, true);
   }
   EXPORT_SYMBOL_GPL(blk_mq_unquiesce_queue);

Hi Hannes,

Please provide more context information. In the "dead" state the queue
must be run to make sure that all requests that were queued before the
"dead" state get processed. The blk_cleanup_queue() function is
responsible for stopping all code that can run the queue after all
requests have finished and before destruction of the data structures
needed for request processing starts.


I have a crash with two processes competing for the same controller:

#0  0xffffffff983d3bcb in sbitmap_any_bit_set (sb=0xffff8a1b874ba0d8)
      at ../lib/sbitmap.c:181
#1  0xffffffff98366c05 in blk_mq_hctx_has_pending (hctx=0xffff8a1b874ba000)
      at ../block/blk-mq.c:66
#2  0xffffffff98366c85 in blk_mq_run_hw_queues (q=0xffff8a1b874ba0d8,
async=true) at ../block/blk-mq.c:1292
#3  0xffffffff98366d3a in blk_mq_unquiesce_queue (q=<optimized out>)
      at ../block/blk-mq.c:265
#4  0xffffffffc01f3e0e in nvme_start_queues (ctrl=<optimized out>)
      at ../drivers/nvme/host/core.c:3658
#5  0xffffffffc01e843c in nvme_fc_delete_association
(ctrl=0xffff8a1f9be5a000)
      at ../drivers/nvme/host/fc.c:2843
#6  0xffffffffc01e8544 in nvme_fc_delete_association (ctrl=<optimized out>)
      at ../drivers/nvme/host/fc.c:2918
#7  0xffffffffc01e8544 in __nvme_fc_terminate_io (ctrl=0xffff8a1f9be5a000)
      at ../drivers/nvme/host/fc.c:2911
#8  0xffffffffc01e8f09 in nvme_fc_reset_ctrl_work (work=0xffff8a1f9be5a6d0)
      at ../drivers/nvme/host/fc.c:2927
#9  0xffffffff980a224a in process_one_work (worker=0xffff8a1b73934f00,
work=0xffff8a1f9be5a6d0) at ../kernel/workqueue.c:2092
#10 0xffffffff980a249b in worker_thread (__worker=0xffff8a1b73934f00)
      at ../kernel/workqueue.c:2226

#7  0xffffffff986d2e9a in wait_for_completion (x=0xffffa48eca88bc40)
      at ../kernel/sched/completion.c:125
#8  0xffffffff980f25ae in __synchronize_srcu (sp=0xffffffff9914fc20
<debugfs_srcu>, do_norm=<optimized out>) at ../kernel/rcu/srcutree.c:851
#9  0xffffffff982d18b1 in debugfs_remove_recursive (dentry=<optimized out>)
      at ../fs/debugfs/inode.c:741
#10 0xffffffff98398ac5 in blk_mq_debugfs_unregister_hctx
(hctx=0xffff8a1b7cccc000) at ../block/blk-mq-debugfs.c:897
#11 0xffffffff983661cf in blk_mq_exit_hctx (q=0xffff8a1f825e4040,
set=0xffff8a1f9be5a0c0, hctx=0xffff8a1b7cccc000, hctx_idx=2) at
../block/blk-mq.c:1987
#12 0xffffffff9836946a in blk_mq_exit_hw_queues (nr_queue=<optimized
out>, set=<optimized out>, q=<optimized out>) at ../block/blk-mq.c:2017
#13 0xffffffff9836946a in blk_mq_free_queue (q=0xffff8a1f825e4040)
      at ../block/blk-mq.c:2506
#14 0xffffffff9835aac5 in blk_cleanup_queue (q=0xffff8a1f825e4040)
      at ../block/blk-core.c:691
#15 0xffffffffc01f5bc8 in nvme_ns_remove (ns=0xffff8a1f819e8f80)
      at ../drivers/nvme/host/core.c:3138
#16 0xffffffffc01f6fea in nvme_validate_ns (ctrl=0xffff8a1f9be5a308, nsid=5)
      at ../drivers/nvme/host/core.c:3164
#17 0xffffffffc01f9053 in nvme_scan_ns_list (nn=<optimized out>,
ctrl=<optimized out>) at ../drivers/nvme/host/core.c:3202
#18 0xffffffffc01f9053 in nvme_scan_work (work=<optimized out>)
      at ../drivers/nvme/host/core.c:3280
#19 0xffffffff980a224a in process_one_work (worker=0xffff8a1b7349f6c0,
work=0xffff8a1f9be5aba0) at ../kernel/workqueue.c:2092

Point is that the queue is already dead by the time nvme_start_queues()
tries to flush existing requests (of which there are none, of course).
I had been looking into synchronizing scan_work and reset_work, but then
I wasn't sure if that wouldn't deadlock somewhere.

James, do you agree that nvme_fc_reset_ctrl_work should be canceled before
nvme_ns_remove() is allowed to call blk_cleanup_queue()?

Hmm. I would've thought exactly the other way round, namely cancelling/flushing the scan_work element when calling reset; after all, reset definitely takes precedence to scanning the controller (which won't work anyway as the controller is about to be reset...)

Cheers,

Hannes
--
Dr. Hannes Reinecke		               zSeries & Storage
hare@xxxxxxxx			               +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)



[Index of Archives]     [Linux RAID]     [Linux SCSI]     [Linux ATA RAID]     [IDE]     [Linux Wireless]     [Linux Kernel]     [ATH6KL]     [Linux Bluetooth]     [Linux Netdev]     [Kernel Newbies]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Device Mapper]

  Powered by Linux