On 5/31/19 10:46 AM, Ming Lei wrote: > On Fri, May 31, 2019 at 10:32:04AM +0200, Hannes Reinecke wrote: >> On 5/31/19 10:21 AM, Ming Lei wrote: >>> On Fri, May 31, 2019 at 09:41:58AM +0200, Hannes Reinecke wrote: >>>> (Resending due to missing mailing list submission) >>>> >>>> Update v3 to support SCSI multiqueue. >>>> >>>> Signed-off-by: Hannes Reinecke <hare@xxxxxxxx> >>>> --- >>>> drivers/scsi/hisi_sas/hisi_sas.h | 1 - >>>> drivers/scsi/hisi_sas/hisi_sas_main.c | 45 +++++++++++++++++----------------- >>>> drivers/scsi/hisi_sas/hisi_sas_v3_hw.c | 44 +++++++++++---------------------- >>>> 3 files changed, 36 insertions(+), 54 deletions(-) >>>> >>>> diff --git a/drivers/scsi/hisi_sas/hisi_sas.h b/drivers/scsi/hisi_sas/hisi_sas.h >>>> index fc87994b5d73..4b6f32f60689 100644 >>>> --- a/drivers/scsi/hisi_sas/hisi_sas.h >>>> +++ b/drivers/scsi/hisi_sas/hisi_sas.h >>>> @@ -378,7 +378,6 @@ struct hisi_hba { >>>> u32 intr_coal_count; /* Interrupt count to coalesce */ >>>> >>>> int cq_nvecs; >>>> - unsigned int *reply_map; >>>> >>>> /* debugfs memories */ >>>> u32 *debugfs_global_reg; >>>> diff --git a/drivers/scsi/hisi_sas/hisi_sas_main.c b/drivers/scsi/hisi_sas/hisi_sas_main.c >>>> index 8a7feb8ed8d6..f4237c4754a4 100644 >>>> --- a/drivers/scsi/hisi_sas/hisi_sas_main.c >>>> +++ b/drivers/scsi/hisi_sas/hisi_sas_main.c >>>> @@ -200,16 +200,12 @@ static void hisi_sas_slot_index_set(struct hisi_hba *hisi_hba, int slot_idx) >>>> set_bit(slot_idx, bitmap); >>>> } >>>> >>>> -static int hisi_sas_slot_index_alloc(struct hisi_hba *hisi_hba, >>>> - struct scsi_cmnd *scsi_cmnd) >>>> +static int hisi_sas_slot_index_alloc(struct hisi_hba *hisi_hba) >>>> { >>>> int index; >>>> void *bitmap = hisi_hba->slot_index_tags; >>>> unsigned long flags; >>>> >>>> - if (scsi_cmnd) >>>> - return scsi_cmnd->request->tag; >>>> - >>>> spin_lock_irqsave(&hisi_hba->lock, flags); >>>> index = find_next_zero_bit(bitmap, hisi_hba->slot_index_count, >>>> hisi_hba->last_slot_index + 1); >>> >>> Then you switch to hisi_sas_slot_index_alloc() for allocating the unique >>> tag via spin_lock & find_next_zero_bit(). Do you think this way is more >>> efficient than blk-mq's sbitmap? >>> >> slot_index_alloc() is only used for commands which do _not_ have a tag >> (eg internal commands), or for v2 hardware which has weird allocation rules. > > But this patch has switched to this allocation unconditionally for all commands: > No: @@ -503,21 +513,10 @@ static int hisi_sas_task_prep(struct sas_task *task, if (hisi_hba->hw->slot_index_alloc) rc = hisi_hba->hw->slot_index_alloc(hisi_hba, device); - else { - struct scsi_cmnd *scsi_cmnd = NULL; - - if (task->uldd_task) { - struct ata_queued_cmd *qc; - - if (dev_is_sata(device)) { - qc = task->uldd_task; - scsi_cmnd = qc->scsicmd; - } else { - scsi_cmnd = task->uldd_task; - } - } - rc = hisi_sas_slot_index_alloc(hisi_hba, scsi_cmnd); - } + else if (blk_tag != (u32)-1) + rc = blk_mq_unique_tag_to_tag(blk_tag); + else + rc = hisi_sas_slot_index_alloc(hisi_hba); if (rc < 0) goto err_out_dif_dma_unmap; First we check for the 'slot_index_alloc()' callback to handle weird v2 allocation rules, _then_ we look for a tag, and only if we do _not_ have a tag we're using the bitmap. And the bitmap is already correctly sized, as otherwise we'd have a clash between internal and tagged I/O commands even now. >> - if (scsi_cmnd) >> - return scsi_cmnd->request->tag; >> - > > Otherwise duplicated slot can be used from different blk-mq hw queue. > >> >>> The worsen thing is that V3's actual max queue depth is (4096 - 96), but >>> this patch claims that the device can support (4096 - 96) * 32 command >>> slots, finally hisi_sas_slot_index_alloc() is used to respect the actual >>> max queue depth(4000). >>> >> Well, this patch is an RFC to demonstrate my idea. Of course the queue >> depth should be identical before and after the conversion. > > That is why I call it is hard to partition the hostwide tags to MQ. > It's not. The driver already sets aside a portion of tags for internal commands (check HISI_SAS_RESERVED_IPTT_CNT), so it is already effectively partitioned. >> >>> Big contention is caused on hisi_sas_slot_index_alloc(), meantime huge> memory is wasted for request pool. >>> >> See above. That allocation is only used if no blk tag is available. > > This patch switches the allocation for all commands. > See above. No. Cheers, Hannes -- Dr. Hannes Reinecke Teamlead Storage & Networking hare@xxxxxxx +49 911 74053 688 SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg GF: Felix Imendörffer, Mary Higgins, Sri Rasiah HRB 21284 (AG Nürnberg)