Re: 4.16-RC7 WARNING: CPU: 2 PID: 0 at block/blk-mq.c:1400 __blk_mq_delay_run_hw_queue

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Stefan,

On Tue, Mar 27, 2018 at 12:04:20PM +0200, Stefan Haberland wrote:
> Hi,
> 
> I get the following warning in __blk_mq_delay_run_hw_queue when the
> scheduler is set to mq-deadline for DASD devices on s390.
> 
> What I see is that for whatever reason there is a hctx nr 0 which has no
> hctx->tags pointer set.
> From my observation it is always hctx nr 0 which has a tags NULL pointer in
> it and I see other hctx which have the hctx->tags pointer set correctly.
> 
> 
> [    2.169986] WARNING: CPU: 0 PID: 0 at block/blk-mq.c:1402
> __blk_mq_delay_run_hw_queue+0xe8/0x118
> [    2.170007] Modules linked in:
> [    2.170014] CPU: 0 PID: 0 Comm: swapper/0 Not tainted
> 4.16.0-rc7-04107-g91a05d9e1d6b-dirty #147
> [    2.170019] Hardware name: IBM 2964 N96 702 (z/VM 6.4.0)
> [    2.170024] Krnl PSW : 0000000076fd6c7f 00000000c244c24d
> (__blk_mq_delay_run_hw_queue+0xe8/0x118)
> [    2.170035]            R:0 T:1 IO:0 EX:0 Key:0 M:1 W:0 P:0 AS:3 CC:0 PM:0
> RI:0 EA:3
> [    2.170041] Krnl GPRS: 00000000599ec58a 0000000002a94000 0000000002a94000
> 0000000000000001
> [    2.170047]            0000000000000000 000000006e761e98 0000000002a96980
> 0000000002a96800
> [    2.170052]            0000000002d87ce0 00000000737fbda8 0000000000000000
> 0000000000000001
> [    2.170058]            0000000000000001 0000000000aedd10 00000000737fbc38
> 00000000737fbc00
> [    2.170069] Krnl Code: 00000000006ea3c8: ebaff0a00004 lmg    
> %r10,%r15,160(%r15)
>                           00000000006ea3ce: c0f4ffff5e0d brcl    15,6d5fe8
>                          #00000000006ea3d4: a7f40001 brc     15,6ea3d6
>                          >00000000006ea3d8: e340f0c00004        lg     
> %r4,192(%r15)
>                           00000000006ea3de: ebaff0a00004 lmg    
> %r10,%r15,160(%r15)
>                           00000000006ea3e4: 07f4 bcr     15,%r4
>                           00000000006ea3e6: 41b01100 la      %r11,256(%r1)
>                           00000000006ea3ea: 182a lr      %r2,%r10
> [    2.170158] Call Trace:
> [    2.170205] ([<0000000002a96800>] 0x2a96800)
> [    2.170248]  [<00000000006ea4c0>] blk_mq_run_hw_queue+0xa0/0x100
> [    2.170262]  [<00000000006ea59c>] blk_mq_run_hw_queues+0x7c/0x98
> [    2.170295]  [<00000000006e88f6>] __blk_mq_complete_request+0x10e/0x1e0
> [    2.170300]  [<00000000006e9e30>] blk_mq_complete_request+0x80/0xa0
> [    2.170307]  [<000000000087fad0>] dasd_block_tasklet+0x218/0x480
> [    2.170415]  [<000000000017c3f8>] tasklet_hi_action+0xa0/0x138
> [    2.170434]  [<0000000000a91c10>] __do_softirq+0xc8/0x540
> [    2.170471]  [<000000000017bd4e>] irq_exit+0x136/0x140
> [    2.170478]  [<000000000010c912>] do_IRQ+0x8a/0xb8
> [    2.170518]  [<0000000000a90ee0>] io_int_handler+0x138/0x2e0
> [    2.170524]  [<0000000000102cd0>] enabled_wait+0x58/0x128
> [    2.170562] ([<0000000000102cb8>] enabled_wait+0x40/0x128)
> [    2.170577]  [<000000000010319a>] arch_cpu_idle+0x32/0x48
> [    2.170604]  [<0000000000a8f636>] default_idle_call+0x3e/0x58
> [    2.170613]  [<00000000001cd5d2>] do_idle+0xda/0x190
> [    2.170621]  [<00000000001cd93e>] cpu_startup_entry+0x3e/0x48
> [    2.170633]  [<0000000000e5ebf4>] start_kernel+0x47c/0x490
> [    2.170641]  [<0000000000100020>] _stext+0x20/0x80
> [    2.170650] 2 locks held by swapper/0/0:
> [    2.170658]  #0:  (&(&dq->lock)->rlock){..-.}, at: [<00000000b45eaf9e>]
> dasd_block_tasklet+0x1cc/0x480
> [    2.170676]  #1:  (rcu_read_lock){....}, at: [<00000000bc7fa045>]
> hctx_lock+0x34/0x110
> [    2.170750] Last Breaking-Event-Address:
> [    2.170758]  [<00000000006ea3d4>] __blk_mq_delay_run_hw_queue+0xe4/0x118
> [    2.170803] ---[ end trace 1073cf0de1fd32d0 ]---

This warning is harmless, please try the following patch:

--
>From 7b2b5139bfef80f44d1b1424e09ab35b715fbfdb Mon Sep 17 00:00:00 2001
From: Ming Lei <ming.lei@xxxxxxxxxx>
Date: Tue, 27 Mar 2018 19:54:23 +0800
Subject: [PATCH] blk-mq: only run mapped hw queues in blk_mq_run_hw_queues()

>From commit 20e4d813931961fe ("blk-mq: simplify queue mapping & schedule
with each possisble CPU") on, it should be easier to see unmapped hctx
in some CPU topo, such as, hctx may not be mapped to any CPU.

This patch avoids the warning in __blk_mq_delay_run_hw_queue() by
checking if the hctx is mapped in blk_mq_run_hw_queues().

Reported-by: Stefan Haberland <sth@xxxxxxxxxxxxxxxxxx>
Cc: Christoph Hellwig <hch@xxxxxx>
Fixes: 20e4d813931961fe ("blk-mq: simplify queue mapping & schedule with each possisble CPU")
Signed-off-by: Ming Lei <ming.lei@xxxxxxxxxx>
---
 block/blk-mq.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 16e83e6df404..48f25a63833b 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1459,7 +1459,7 @@ void blk_mq_run_hw_queues(struct request_queue *q, bool async)
 	int i;
 
 	queue_for_each_hw_ctx(q, hctx, i) {
-		if (blk_mq_hctx_stopped(hctx))
+		if (blk_mq_hctx_stopped(hctx) || !blk_mq_hw_queue_mapped(hctx))
 			continue;
 
 		blk_mq_run_hw_queue(hctx, async);
-- 
2.9.5


-- 
Ming



[Index of Archives]     [Linux RAID]     [Linux SCSI]     [Linux ATA RAID]     [IDE]     [Linux Wireless]     [Linux Kernel]     [ATH6KL]     [Linux Bluetooth]     [Linux Netdev]     [Kernel Newbies]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Device Mapper]

  Powered by Linux