On Thu, Jun 04, 2020 at 12:50:32PM +0100, John Garry wrote:
> On 04/06/2020 12:26, Ming Lei wrote:
> > > And another:
> > > 
> > > [ 251.336698] CPU53: shutdown
> > > [ 251.336737] run queue from wrong CPU 0, hctx active
> > > [ 251.339545] run queue from wrong CPU 0, hctx active
> > > [ 251.344695] psci: CPU53 killed (polled 0 ms)
> > > [ 251.349302] run queue from wrong CPU 0, hctx active
> > > [ 251.358447] run queue from wrong CPU 0, hctx active
> > > [ 251.363336] run queue from wrong CPU 0, hctx active
> > > [ 251.368222] run queue from wrong CPU 0, hctx active
> > > [ 251.373108] run queue from wrong CPU 0, hctx active
> > > [ 252.488831] CPU54: shutdown=1094MiB/s,w=0KiB/s][r=280k,w=0 IOPS][eta 00m:04s]
> > > [ 252.491629] psci: CPU54 killed (polled 0 ms)
> > > [ 252.536533] irq_shutdown irq146
> > > [ 252.540007] CPU55: shutdown
> > > [ 252.543273] psci: CPU55 killed (polled 0 ms)
> > > [ 252.656656] CPU56: shutdown
> > > [ 252.659446] psci: CPU56 killed (polled 0 ms)
> > > [ 252.704576] CPU57: shutdown=947MiB/s,w=0KiB/s][r=242k,w=0 IOPS][eta 00m:02s]
> > > [ 252.707369] psci: CPU57 killed (polled 0 ms)
> > > [ 254.352646] CPU58: shutdown=677MiB/s,w=0KiB/s][r=173k,w=0 IOPS][eta 00m:02s]
> > > [ 254.355442] psci: CPU58 killed (polled 0 ms)
> > > long sleep 1
> > > [ 279.288227] scsi_times_out req=0xffff041fa10b6600[r=0,w=0 IOPS][eta 04m:37s]
> > > [ 279.320281] sas: Enter sas_scsi_recover_host busy: 1 failed: 1
> > > [ 279.326144] sas: trying to find task 0x00000000e8dca422
> > > [ 279.331379] sas: sas_scsi_find_task: aborting task 0x00000000e8dca422
> > > 
> > > The none scheduler seems ok.
> > 
> > Can you reproduce it by just applying the patch 'blk-mq: don't fail
> > driver tag allocation because of inactive hctx'?
> 
> That's your patch - ok, I can try.
> 
> So I added some debug code for when using Christoph's patch:
> 
> static bool blk_mq_hctx_has_requests2(struct blk_mq_hw_ctx *hctx)
> {
> 	struct blk_mq_tags *tags = hctx->tags;
> 	struct rq_iter_data data = {
> 		.hctx = hctx,
> 	};
> 
> 	/* Look for requests still holding a driver tag (hctx->tags). */
> 	blk_mq_all_tag_iter(tags, blk_mq_has_request, &data);
> 	return data.has_rq;
> }
> 
> static inline bool blk_mq_last_cpu_in_hctx(unsigned int cpu,
> 		struct blk_mq_hw_ctx *hctx)
> {
> 
> @@ -2386,6 +2398,8 @@ static int blk_mq_hctx_notify_offline(unsigned int cpu, struct hlist_node *node)
>  	if (percpu_ref_tryget(&hctx->queue->q_usage_counter)) {
>  		while (blk_mq_hctx_has_requests(hctx))
>  			msleep(5);
> +		if (blk_mq_hctx_has_requests2(hctx))
> +			pr_err("%s still has driver tag - but how? hctx->sched_tags=%pS\n",
> +				__func__, hctx->sched_tags);
>  		percpu_ref_put(&hctx->queue->q_usage_counter);
>  	} else {
> 
> And I get this:
> 
> [ 366.839260] CPU46: shutdown
> [ 366.842052] psci: CPU46 killed (polled 0 ms)
> [ 366.910923] blk_mq_hctx_notify_offline still has driver tag - but how? hctx->sched_tags=0xffff041fa59e0d00
> [ 366.920870] irq_migrate_all_off_this_cpu: 5 callbacks suppressed
> [ 366.920871] IRQ 113: no longer affine to CPU47
> [ 366.931299] IRQ 116: no longer affine to CPU47
> [ 366.935734] IRQ 119: no longer affine to CPU47
> [ 366.940169] IRQ 123: no longer affine to CPU47
> [ 366.944607] irq_shutdown irq144
> [ 366.947783] IRQ 330: no longer affine to CPU47
> [ 366.952217] IRQ 333: no longer affine to CPU47
> [ 366.956651] IRQ 336: no longer affine to CPU47
> [ 366.961085] IRQ 339: no longer affine to CPU47
> [ 366.965519] IRQ 342: no longer affine to CPU47
> [ 366.969953] IRQ 345: no longer affine to CPU47
> [ 366.974475] CPU47: shutdown
> [ 366.977273] psci: CPU47 killed (polled 0 ms)
> [ 367.063066] CPU48: shutdown
> [ 367.065857] psci: CPU48 killed (polled 0 ms)
> [ 367.107071] CPU49: shutdown
> [ 367.109858] psci: CPU49 killed (polled 0 ms)
> [ 367.147057] CPU50: shutdown
> [ 367.149843] psci: CPU50 killed (polled 0 ms)
> [ 367.175006] irq_shutdown irq145
> [ 367.178272] CPU51: shutdown
> [ 367.181070] psci: CPU51 killed (polled 0 ms)
> [ 367.223067] CPU52: shutdown
> [ 367.225856] psci: CPU52 killed (polled 0 ms)
> [ 367.267024] CPU53: shutdown
> [ 367.269821] psci: CPU53 killed (polled 0 ms)
> [ 367.303004] CPU54: shutdown
> [ 367.305792] psci: CPU54 killed (polled 0 ms)
> [ 367.342948] irq_shutdown irq146
> [ 367.346169] CPU55: shutdown
> [ 367.348965] psci: CPU55 killed (polled 0 ms)
> [ 367.387004] CPU56: shutdown][16.9%][r=869MiB/s,w=0KiB/s][r=223k,w=0 IOPS][eta 04m:59s]
> [ 367.389791] psci: CPU56 killed (polled 0 ms)
> [ 367.419017] CPU57: shutdown
> [ 367.421804] psci: CPU57 killed (polled 0 ms)
> [ 367.447010] CPU58: shutdown
> [ 367.449796] psci: CPU58 killed (polled 0 ms)
> long sleep 2
> [ 398.070707] scsi_times_out req=0xffff041fa1a89c00iB/s][r=0,w=0 IOPS][eta 04m:29s]
> 
> I thought that if all the sched tags are put, then we should have no
> driver tag for that same hctx, right? That seems to coincide with the
> timeout (30 seconds later).

That is weird: if a driver tag is found, that means the request is still
in flight and has not been completed by the HW.
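Your reasoning is right: with an I/O scheduler attached, a request holds
its sched tag for its whole lifetime but only takes a driver tag at
dispatch time, so once every sched tag is free there should be no driver
tag left for that hctx. Roughly, the invariant your check probes is the
following (just a sketch reusing blk_mq_has_request() and struct
rq_iter_data from your diff; blk_mq_hctx_fully_idle() is a made-up name,
not code from any posted patch):

static bool blk_mq_hctx_fully_idle(struct blk_mq_hw_ctx *hctx)
{
	struct rq_iter_data data = { .hctx = hctx };

	/*
	 * Sched tags cover a request's whole lifetime when a scheduler
	 * is attached; without a scheduler, driver tags do.
	 */
	blk_mq_all_tag_iter(hctx->sched_tags ? hctx->sched_tags : hctx->tags,
			    blk_mq_has_request, &data);
	if (data.has_rq)
		return false;

	/*
	 * Driver tags are only held from dispatch to completion, so a
	 * request found here after all sched tags are free means the two
	 * tag sets disagree - exactly what your pr_err() caught.
	 */
	data.has_rq = false;
	blk_mq_all_tag_iter(hctx->tags, blk_mq_has_request, &data);
	return !data.has_rq;
}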
I assume you have integrated the global host tags patch in your test,
and I suggest you double-check hisi_sas's queue mapping, which has to be
exactly the same as blk-mq's mapping (a sketch of what I mean is in the
P.S. below).

> > 
> > If yes, can you collect debugfs log after the timeout is triggered?
> 
> Same limitation as before - once a SCSI timeout happens, SCSI error
> handling kicks in and the shost no longer accepts commands; since that
> same shost provides the rootfs, the system becomes unresponsive. But I
> can try.

Just wondering: why not install two disks in your test machine? :-)

Thanks,
Ming
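P.S. To be concrete about the queue-mapping point: the sketch below is
what I mean. It is not the actual hisi_sas code - example_map_queues()
is a stand-in - it only illustrates that the driver's CPU -> HW queue
mapping must be the one blk-mq builds, and that the submission path must
derive the HW queue from the request's hctx:

#include <linux/blk-mq-pci.h>	/* blk_mq_pci_map_queues() */
#include <linux/pci.h>		/* to_pci_dev() */
#include <scsi/scsi_host.h>

/*
 * Sketch only: build the CPU -> hctx mapping from the managed-IRQ
 * affinity, so that hctx N covers exactly the CPUs whose completion
 * interrupt is routed to HW queue N.
 */
static int example_map_queues(struct Scsi_Host *shost)
{
	struct blk_mq_queue_map *qmap = &shost->tag_set.map[HCTX_TYPE_DEFAULT];

	return blk_mq_pci_map_queues(qmap, to_pci_dev(shost->dma_dev), 0);
}

/*
 * On the submission side, the delivery queue must then come from the
 * request's hctx, e.g.:
 *
 *	u32 dq_index = cmd->request->mq_hctx->queue_num;
 *
 * If the driver picks a queue any other way (say raw_smp_processor_id()
 * at submission time), a request can still be in flight on a HW queue
 * whose hctx blk-mq already considers drained - which would leave
 * exactly the stale driver tag your log shows.
 */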