On Mon, Nov 11, 2019 at 02:02:27PM +0000, John Garry wrote:
> On 27/10/2019 08:19, Ming Lei wrote:
> > >  	.this_id		= -1,
> > > @@ -3265,8 +3300,14 @@ hisi_sas_v3_probe(struct pci_dev *pdev, const struct pci_device_id *id)
> > >  	shost->max_lun = ~0;
> > >  	shost->max_channel = 1;
> > >  	shost->max_cmd_len = 16;
> > > -	shost->can_queue = HISI_SAS_UNRESERVED_IPTT;
> > > -	shost->cmd_per_lun = HISI_SAS_UNRESERVED_IPTT;
> > > +
>
> Hi Ming,
>
> I mentioned in the thread "blk-mq: improvement on handling IO during CPU
> hotplug" that I was using this series to test that patchset.
>
> So just with this patchset (and without yours), I get what looks like some
> IO errors in the LLDD. The error is an underflow error. I can't figure out
> what is the cause.

Can you post the error log? Or explain what the 'underflow error' means
from the hisi sas or SCSI viewpoint?

>
> I'm wondering if the SCSI command is getting corrupted someway.

Why do you think the command is corrupted?

>
> > > +	if (expose_mq_experimental) {
> > > +		shost->can_queue = HISI_SAS_MAX_COMMANDS;
> > > +		shost->cmd_per_lun = HISI_SAS_MAX_COMMANDS;
> > The above is contradictory with current 'nr_hw_queues''s meaning,
> > see commit on Scsi_Host.nr_hw_queues.
> >
>
> Right, so I am generating the hostwide tag in the LLDD. And the Scsi
> host-wide host_busy counter should ensure that we don't pump too much IO to
> the HBA.

Even without the host-wide host_busy counter, your approach should work,
just not efficiently, as long as you build the hisi sas tag correctly
(i.e. uniquely). I'd suggest collecting a trace and checking whether each
request is sent to the hardware with the expected hisi sas tag.

BTW, the patch 'scsi: core: avoid host-wide host_busy counter for scsi_mq'
will be merged into v5.5 if everything is fine.

https://git.kernel.org/pub/scm/linux/kernel/git/mkp/scsi.git/commit/?h=5.5/scsi-queue&id=6eb045e092efefafc6687409a6fa6d1dabf0fb69

Thanks,
Ming
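
The uniqueness requirement discussed above can be illustrated with a minimal
userspace sketch. This is not the hisi_sas implementation: the pool size and
every sketch_* name are hypothetical, and the code is single-threaded, whereas
a real driver would need locking or atomic bit operations. It only shows the
invariant that no tag may be handed out twice while it is still in flight,
which is what a trace should confirm.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical pool size; stands in for the HBA's command limit. */
#define SKETCH_MAX_TAGS 64u

/* One flag per tag: nonzero while a command using that tag is in flight. */
struct sketch_tag_pool {
	uint8_t in_use[SKETCH_MAX_TAGS];
};

static void sketch_pool_init(struct sketch_tag_pool *p)
{
	memset(p->in_use, 0, sizeof(p->in_use));
}

/* Return the lowest free tag, or -1 when every tag is in flight. */
static int sketch_tag_alloc(struct sketch_tag_pool *p)
{
	unsigned int i;

	for (i = 0; i < SKETCH_MAX_TAGS; i++) {
		if (!p->in_use[i]) {
			p->in_use[i] = 1;
			return (int)i;
		}
	}
	return -1;
}

/*
 * Release a tag on completion. Return 0 on success, or -1 if the tag is
 * out of range or was not in flight (for example a double completion).
 */
static int sketch_tag_free(struct sketch_tag_pool *p, int tag)
{
	if (tag < 0 || (unsigned int)tag >= SKETCH_MAX_TAGS || !p->in_use[tag])
		return -1;
	p->in_use[tag] = 0;
	return 0;
}
```

A failing sketch_tag_free() in such a model would correspond to a completion
arriving for a tag that was never issued, which is one pattern worth looking
for in a trace.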