On Thu, Jun 04, 2020 at 01:45:09PM +0100, John Garry wrote:
> > > That's your patch - ok, I can try.
> >
>
> I still get timeouts and sometimes the same driver tag message occurs:
>
> [ 1014.232417] run queue from wrong CPU 0, hctx active
> [ 1014.237692] run queue from wrong CPU 0, hctx active
> [ 1014.243014] run queue from wrong CPU 0, hctx active
> [ 1014.248370] run queue from wrong CPU 0, hctx active
> [ 1014.253725] run queue from wrong CPU 0, hctx active
> [ 1014.259252] run queue from wrong CPU 0, hctx active
> [ 1014.264492] run queue from wrong CPU 0, hctx active
> [ 1014.269453] irq_shutdown irq146
> [ 1014.272752] CPU55: shutdown
> [ 1014.275552] psci: CPU55 killed (polled 0 ms)
> [ 1015.151530] CPU56: shutdownr=1621MiB/s,w=0KiB/s][r=415k,w=0 IOPS][eta 00m:00s]
> [ 1015.154322] psci: CPU56 killed (polled 0 ms)
> [ 1015.184345] CPU57: shutdown
> [ 1015.187143] psci: CPU57 killed (polled 0 ms)
> [ 1015.223388] CPU58: shutdown
> [ 1015.226174] psci: CPU58 killed (polled 0 ms)
> long sleep 8
> [ 1045.234781] scsi_times_out req=0xffff041fa13e6300[r=0,w=0 IOPS][eta 04m:30s]
>
> [...]
>
> > > I thought that if all the sched tags are put, then we should have no driver
> > > tag for that same hctx, right? That seems to coincide with the timeout (30
> > > seconds later)
> >
> > That is weird, if there is driver tag found, that means the request is
> > in-flight and can't be completed by HW.
>
> In blk_mq_hctx_has_requests(), we iterate the sched tags (when
> hctx->sched_tags is set). So can some requests not have a sched tag (even
> for scheduler set for the queue)?

Every request must have a scheduler tag when an I/O scheduler is in use.

> > I assume you have integrated
> > global host tags patch in your test,
>
> No, but the LLDD does not use request->tag - it generates its own.
It isn't related to request->tag. What I meant is that you are using an
out-of-tree patch to enable multiple hw queues on hisi_sas, so you have to
make the queue mapping correct, that is, the driver has to use exactly the
queue mapping that blk-mq builds, which is derived from the managed
interrupt affinity.

Please collect the following log:

1) ./dump-io-irq-affinity $PCI_ID_OF_HBA
http://people.redhat.com/minlei/tests/tools/dump-io-irq-affinity

2) ./dump-qmap /dev/sdN
http://people.redhat.com/minlei/tests/tools/dump-qmap

> > and suggest you to double check
> > hisi_sas's queue mapping which has to be exactly same with blk-mq's
> > mapping.
>
> scheduler=none is ok, so I am skeptical of a problem there.
>
> > > > If yes, can you collect debugfs log after the timeout is triggered?
> > >
> > > Same limitation as before - once SCSI timeout happens, SCSI error handling
> > > kicks in and the shost no longer accepts commands, and, since that same
> > > shost provides rootfs, becomes unresponsive. But I can try.
> >
> > Just wondering why not install two disks in your test machine, :-)
>
> The shost becomes unresponsive for all disks. So I could try nfs, but I'm
> not a fan :)

Then collecting the log will take you some extra effort, and an NFS root
should be quite easy to set up, :-)

Thanks,
Ming
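The invariant Ming asks John to verify with dump-io-irq-affinity and dump-qmap can be stated simply: for every hw queue, the CPU set in blk-mq's queue map must equal the CPU set that the queue's managed interrupt is bound to. A minimal sketch of that consistency check follows; the function name and the example mapping data are invented for illustration, and on a real system the two maps would come from the tools linked above (or from /sys/block/<dev>/mq/*/cpu_list and /proc/irq/<n>/effective_affinity_list):

```python
# Sketch (not from the thread): compare the driver's per-hw-queue CPU
# assignment against blk-mq's queue map. Both maps are dicts of
# {hw_queue_index: set of CPU ids}.
def mappings_consistent(blk_mq_map, irq_affinity_map):
    """True only if every hw queue is served by exactly the CPUs whose
    managed interrupt is bound to that queue -- the property that must
    hold for CPU-hotplug quiescing to work."""
    if blk_mq_map.keys() != irq_affinity_map.keys():
        return False
    return all(blk_mq_map[q] == irq_affinity_map[q] for q in blk_mq_map)

# Invented example: a 2-queue HBA on a 4-CPU machine.
good = {0: {0, 1}, 1: {2, 3}}
bad = {0: {0, 1}, 1: {1, 2}}    # queue 1's irq affinity disagrees

print(mappings_consistent(good, good))  # True
print(mappings_consistent(good, bad))   # False
```

If the check fails, requests queued on a hctx can be completed on a CPU the hctx does not expect (matching the "run queue from wrong CPU" messages in John's log), and a request can be left in flight when its CPUs are offlined.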