On Fri, May 31, 2019 at 10:42:12AM +0100, John Garry wrote: > > > > > > > > > > -static int hisi_sas_slot_index_alloc(struct hisi_hba *hisi_hba, > > > > > - struct scsi_cmnd *scsi_cmnd) > > > > > +static int hisi_sas_slot_index_alloc(struct hisi_hba *hisi_hba) > > > > > { > > > > > int index; > > > > > void *bitmap = hisi_hba->slot_index_tags; > > > > > unsigned long flags; > > > > > > > > > > - if (scsi_cmnd) > > > > > - return scsi_cmnd->request->tag; > > > > > - > > > > > spin_lock_irqsave(&hisi_hba->lock, flags); > > > > > index = find_next_zero_bit(bitmap, hisi_hba->slot_index_count, > > > > > hisi_hba->last_slot_index + 1); > > > > > > > > Then you switch to hisi_sas_slot_index_alloc() for allocating the unique > > > > tag via spin_lock & find_next_zero_bit(). Do you think this way is more > > > > efficient than blk-mq's sbitmap? > > These are not fast path, as used only for TMF, internal IO, etc. OK, looks I miss that. > > Having said that, hopefully we can move to scsi_{get,put}_reserved_cmd() > when available, so that the LLDD has to stop managing them. Agree. > > > > > > > > slot_index_alloc() is only used for commands which do _not_ have a tag > > > (eg internal commands), or for v2 hardware which has weird allocation rules. > > > > But this patch has switched to this allocation unconditionally for all commands: > > > > As Hannes said, v2 had a few bugs which meant that we had to make a specific > version of this function for that hw revision, cf. > slot_index_alloc_quirk_v2_hw(), and it cannot use request queue tag. > > But, indeed, we could consider sbitmap for v2 hw. I'm not sure if it would > help, considering the weird rules. > > > > - if (scsi_cmnd) > > > - return scsi_cmnd->request->tag; > > > - > > > > Otherwise duplicated slot can be used from different blk-mq hw queue. > > > > > > > > > The worsen thing is that V3's actual max queue depth is (4096 - 96), but > > > > this patch claims that the device can support (4096 - 96) * 32 command > > > > slots > > To be clear about the hw, the hw supports max 4096 command tags and has 16 Is 4096 the max allowed host-wide command tags? Or per-queue's max commands tags? If it is per-queue's max command tags, V3 should be real MQ hardware, otherwise it is still host wide. Otherwise, Hannes's patch can't work because tag from different hw queue can be same. > hw queues. The hw queue depth is configurable by software, and we configure > it at 4096 per queue - same as max command tags - this is to support > possibility of all commands using the same queue simultaneously. Suppose 4096 is the host-wide command tags: As Hannes's patch changed to allow each blk-mq hw queue to accept 4096 commands, there will be big contention in driver, given now the actual .can_queue becomes 4096 * 32, and how to avoid the same tag from different hw queue? Thanks, Ming