On Thu, Jun 04, 2020 at 12:50:32PM +0100, John Garry wrote:
> On 04/06/2020 12:26, Ming Lei wrote:
> > > And another:
> > > 
> > > [ 251.336698] CPU53: shutdown
> > > [ 251.336737] run queue from wrong CPU 0, hctx active
> > > [ 251.339545] run queue from wrong CPU 0, hctx active
> > > [ 251.344695] psci: CPU53 killed (polled 0 ms)
> > > [ 251.349302] run queue from wrong CPU 0, hctx active
> > > [ 251.358447] run queue from wrong CPU 0, hctx active
> > > [ 251.363336] run queue from wrong CPU 0, hctx active
> > > [ 251.368222] run queue from wrong CPU 0, hctx active
> > > [ 251.373108] run queue from wrong CPU 0, hctx active
> > > [ 252.488831] CPU54: shutdown=1094MiB/s,w=0KiB/s][r=280k,w=0 IOPS][eta 00m:04s]
> > > [ 252.491629] psci: CPU54 killed (polled 0 ms)
> > > [ 252.536533] irq_shutdown irq146
> > > [ 252.540007] CPU55: shutdown
> > > [ 252.543273] psci: CPU55 killed (polled 0 ms)
> > > [ 252.656656] CPU56: shutdown
> > > [ 252.659446] psci: CPU56 killed (polled 0 ms)
> > > [ 252.704576] CPU57: shutdown=947MiB/s,w=0KiB/s][r=242k,w=0 IOPS][eta 00m:02s]
> > > [ 252.707369] psci: CPU57 killed (polled 0 ms)
> > > [ 254.352646] CPU58: shutdown=677MiB/s,w=0KiB/s][r=173k,w=0 IOPS][eta 00m:02s]
> > > [ 254.355442] psci: CPU58 killed (polled 0 ms)
> > > long sleep 1
> > > [ 279.288227] scsi_times_out req=0xffff041fa10b6600[r=0,w=0 IOPS][eta 04m:37s]
> > > [ 279.320281] sas: Enter sas_scsi_recover_host busy: 1 failed: 1
> > > [ 279.326144] sas: trying to find task 0x00000000e8dca422
> > > [ 279.331379] sas: sas_scsi_find_task: aborting task 0x00000000e8dca422
> > > 
> > > The none scheduler seems ok.
> > 
> > Can you reproduce it by just applying the patch 'blk-mq: don't fail
> > driver tag allocation because of inactive hctx'?
> 
> That's your patch - ok, I can try.
> 
> So I added some debug code for when using Christoph's patch:
> 
> static bool blk_mq_hctx_has_requests2(struct blk_mq_hw_ctx *hctx)
> {
> 	struct blk_mq_tags *tags = hctx->tags;
> 	struct rq_iter_data data = {
> 		.hctx = hctx,
> 	};
> 
> 	/* Look for requests still holding a driver tag (hctx->tags). */
> 	blk_mq_all_tag_iter(tags, blk_mq_has_request, &data);
> 	return data.has_rq;
> }
> 
> static inline bool blk_mq_last_cpu_in_hctx(unsigned int cpu,
> 		struct blk_mq_hw_ctx *hctx)
> {
> 
> @@ -2386,6 +2398,8 @@ static int blk_mq_hctx_notify_offline(unsigned int cpu, struct hlist_node *node)
>  	if (percpu_ref_tryget(&hctx->queue->q_usage_counter)) {
>  		while (blk_mq_hctx_has_requests(hctx))
>  			msleep(5);
> +		if (blk_mq_hctx_has_requests2(hctx))
> +			pr_err("%s still has driver tag - but how? hctx->sched_tags=%pS\n",
> +				__func__, hctx->sched_tags);
>  		percpu_ref_put(&hctx->queue->q_usage_counter);
>  	} else {
> 
> And I get this:
> 
> [ 366.839260] CPU46: shutdown
> [ 366.842052] psci: CPU46 killed (polled 0 ms)
> [ 366.910923] blk_mq_hctx_notify_offline still has driver tag - but how? hctx->sched_tags=0xffff041fa59e0d00
> [ 366.920870] irq_migrate_all_off_this_cpu: 5 callbacks suppressed
> [ 366.920871] IRQ 113: no longer affine to CPU47
> [ 366.931299] IRQ 116: no longer affine to CPU47
> [ 366.935734] IRQ 119: no longer affine to CPU47
> [ 366.940169] IRQ 123: no longer affine to CPU47
> [ 366.944607] irq_shutdown irq144
> [ 366.947783] IRQ 330: no longer affine to CPU47
> [ 366.952217] IRQ 333: no longer affine to CPU47
> [ 366.956651] IRQ 336: no longer affine to CPU47
> [ 366.961085] IRQ 339: no longer affine to CPU47
> [ 366.965519] IRQ 342: no longer affine to CPU47
> [ 366.969953] IRQ 345: no longer affine to CPU47
> [ 366.974475] CPU47: shutdown
> [ 366.977273] psci: CPU47 killed (polled 0 ms)
> [ 367.063066] CPU48: shutdown
> [ 367.065857] psci: CPU48 killed (polled 0 ms)
> [ 367.107071] CPU49: shutdown
> [ 367.109858] psci: CPU49 killed (polled 0 ms)
> [ 367.147057] CPU50: shutdown
> [ 367.149843] psci: CPU50 killed (polled 0 ms)
> [ 367.175006] irq_shutdown irq145
> [ 367.178272] CPU51: shutdown
> [ 367.181070] psci: CPU51 killed (polled 0 ms)
> [ 367.223067] CPU52: shutdown
> [ 367.225856] psci: CPU52 killed (polled 0 ms)
> [ 367.267024] CPU53: shutdown
> [ 367.269821] psci: CPU53 killed (polled 0 ms)
> [ 367.303004] CPU54: shutdown
> [ 367.305792] psci: CPU54 killed (polled 0 ms)
> [ 367.342948] irq_shutdown irq146
> [ 367.346169] CPU55: shutdown
> [ 367.348965] psci: CPU55 killed (polled 0 ms)
> [ 367.387004] CPU56: shutdown][16.9%][r=869MiB/s,w=0KiB/s][r=223k,w=0 IOPS][eta 04m:59s]
> [ 367.389791] psci: CPU56 killed (polled 0 ms)
> [ 367.419017] CPU57: shutdown
> [ 367.421804] psci: CPU57 killed (polled 0 ms)
> [ 367.447010] CPU58: shutdown
> [ 367.449796] psci: CPU58 killed (polled 0 ms)
> long sleep 2
> [ 398.070707] scsi_times_out req=0xffff041fa1a89c00iB/s][r=0,w=0 IOPS][eta 04m:29s]
> 
> I thought that if all the sched tags are put, then we should have no
> driver tag for that same hctx, right? That seems to coincide with the
> timeout (30 seconds later).

That is weird: if a driver tag is found, that means the request is still
in flight and has not been completed by the HW.
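Your reasoning is right: with an I/O scheduler attached, a request holds
its sched tag for its whole lifetime but only takes a driver tag at
dispatch time, so once every sched tag is free there should be no driver
tag left for that hctx. Roughly, the invariant your check probes is the
following (just a sketch reusing blk_mq_has_request() and struct
rq_iter_data from your diff; blk_mq_hctx_fully_idle() is a made-up name,
not code from any posted patch):

static bool blk_mq_hctx_fully_idle(struct blk_mq_hw_ctx *hctx)
{
	struct rq_iter_data data = { .hctx = hctx };

	/*
	 * Sched tags cover a request's whole lifetime when a scheduler
	 * is attached; without a scheduler, driver tags do.
	 */
	blk_mq_all_tag_iter(hctx->sched_tags ? hctx->sched_tags : hctx->tags,
			    blk_mq_has_request, &data);
	if (data.has_rq)
		return false;

	/*
	 * Driver tags are only held from dispatch to completion, so a
	 * request found here after all sched tags are free means the two
	 * tag sets disagree - exactly what your pr_err() caught.
	 */
	data.has_rq = false;
	blk_mq_all_tag_iter(hctx->tags, blk_mq_has_request, &data);
	return !data.has_rq;
}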
I assume you have integrated the global host tags patch in your test,
and I suggest you double-check hisi_sas's queue mapping, which has to be
exactly the same as blk-mq's mapping (a sketch of what I mean is in the
P.S. below).

> > 
> > If yes, can you collect debugfs log after the timeout is triggered?
> 
> Same limitation as before - once a SCSI timeout happens, SCSI error
> handling kicks in and the shost no longer accepts commands; since that
> same shost provides the rootfs, the system becomes unresponsive. But I
> can try.

Just wondering: why not install two disks in your test machine? :-)

Thanks,
Ming
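P.S. To be concrete about the queue-mapping point: the sketch below is
what I mean. It is not the actual hisi_sas code - example_map_queues()
is a stand-in - it only illustrates that the driver's CPU -> HW queue
mapping must be the one blk-mq builds, and that the submission path must
derive the HW queue from the request's hctx:

#include <linux/blk-mq-pci.h>	/* blk_mq_pci_map_queues() */
#include <linux/pci.h>		/* to_pci_dev() */
#include <scsi/scsi_host.h>

/*
 * Sketch only: build the CPU -> hctx mapping from the managed-IRQ
 * affinity, so that hctx N covers exactly the CPUs whose completion
 * interrupt is routed to HW queue N.
 */
static int example_map_queues(struct Scsi_Host *shost)
{
	struct blk_mq_queue_map *qmap = &shost->tag_set.map[HCTX_TYPE_DEFAULT];

	return blk_mq_pci_map_queues(qmap, to_pci_dev(shost->dma_dev), 0);
}

/*
 * On the submission side, the delivery queue must then come from the
 * request's hctx, e.g.:
 *
 *	u32 dq_index = cmd->request->mq_hctx->queue_num;
 *
 * If the driver picks a queue any other way (say raw_smp_processor_id()
 * at submission time), a request can still be in flight on a HW queue
 * whose hctx blk-mq already considers drained - which would leave
 * exactly the stale driver tag your log shows.
 */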