On 12/7/20 3:56 AM, Hannes Reinecke wrote:
> On 12/4/20 3:26 PM, Brian King wrote:
>> On 12/2/20 11:27 AM, Tyrel Datwyler wrote:
>>> On 12/2/20 7:14 AM, Brian King wrote:
>>>> On 12/1/20 6:53 PM, Tyrel Datwyler wrote:
>>>>> Introduce several new vhost fields for managing MQ state of the adapter
>>>>> as well as initial defaults for MQ enablement.
>>>>>
>>>>> Signed-off-by: Tyrel Datwyler <tyreld@xxxxxxxxxxxxx>
>>>>> ---
>>>>>  drivers/scsi/ibmvscsi/ibmvfc.c |  9 ++++++++-
>>>>>  drivers/scsi/ibmvscsi/ibmvfc.h | 13 +++++++++++--
>>>>>  2 files changed, 19 insertions(+), 3 deletions(-)
>>>>>
>>>>> diff --git a/drivers/scsi/ibmvscsi/ibmvfc.c b/drivers/scsi/ibmvscsi/ibmvfc.c
>>>>> index 42e4d35e0d35..f1d677a7423d 100644
>>>>> --- a/drivers/scsi/ibmvscsi/ibmvfc.c
>>>>> +++ b/drivers/scsi/ibmvscsi/ibmvfc.c
>>>>> @@ -5161,12 +5161,13 @@ static int ibmvfc_probe(struct vio_dev *vdev, const
>>>>> struct vio_device_id *id)
>>>>>     }
>>>>>     shost->transportt = ibmvfc_transport_template;
>>>>> -   shost->can_queue = max_requests;
>>>>> +   shost->can_queue = (max_requests / IBMVFC_SCSI_HW_QUEUES);
>>>>
>>>> This doesn't look right. can_queue is the SCSI host queue depth, not the MQ
>>>> queue depth.
>>>
>>> Our max_requests is the total number commands allowed across all queues. From
>>> what I understand is can_queue is the total number of commands in flight allowed
>>> for each hw queue.
>>>
>>> /*
>>>  * In scsi-mq mode, the number of hardware queues supported by the LLD.
>>>  *
>>>  * Note: it is assumed that each hardware queue has a queue depth of
>>>  * can_queue. In other words, the total queue depth per host
>>>  * is nr_hw_queues * can_queue. However, for when host_tagset is set,
>>>  * the total queue depth is can_queue.
>>>  */
>>>
>>> We currently don't use the host wide shared tagset.
>>
>> Ok. I missed that bit... In that case, since we allocate by default only 100
>> event structs.
>> If we slice that across IBMVFC_SCSI_HW_QUEUES (16) queues, then
>> we end up with only about 6 commands that can be outstanding per queue,
>> which is going to really hurt performance... I'd suggest bumping up
>> IBMVFC_MAX_REQUESTS_DEFAULT from 100 to 1000 as a starting point.
>>
> Before doing that I'd rather use the host-wide shared tagset.
> Increasing the number of requests will increase the memory footprint of the
> driver (as each request will be statically allocated).
>

In the case where we use host-wide tags, how do I determine the queue depth per
hardware queue? Is it hypothetically can_queue, or is it (can_queue /
nr_hw_queues)? We want to allocate an event pool per queue, which made sense
without host-wide tags since the queue depth per hw queue is exactly can_queue.

-Tyrel
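
For reference, the arithmetic discussed above can be sketched as follows. This is
only an illustration of the numbers in this thread and of the quoted scsi_host.h
comment, not driver code: the helper names per_queue_can_queue() and
total_host_depth() are hypothetical, and the sketch does not answer how deep each
individual hw queue is when host_tagset is set.

```c
/* Values quoted in the thread. */
#define IBMVFC_MAX_REQUESTS_DEFAULT 100
#define IBMVFC_SCSI_HW_QUEUES       16

/*
 * Hypothetical helper: per-queue can_queue when a host-wide request budget
 * is split evenly across hw queues, using integer division as in the patch.
 */
unsigned int per_queue_can_queue(unsigned int max_requests,
                                 unsigned int nr_hw_queues)
{
    return nr_hw_queues ? max_requests / nr_hw_queues : 0;
}

/*
 * Hypothetical helper: total host queue depth as described by the quoted
 * comment. Without host_tagset it is nr_hw_queues * can_queue; with
 * host_tagset it is can_queue.
 */
unsigned int total_host_depth(unsigned int can_queue,
                              unsigned int nr_hw_queues,
                              int host_tagset)
{
    return host_tagset ? can_queue : can_queue * nr_hw_queues;
}
```

With the current default, per_queue_can_queue(100, 16) gives 6 (96 commands in
total across 16 queues), and raising the default to 1000 gives 62 per queue.
With host_tagset set, total_host_depth(100, 16, 1) stays at 100 regardless of
the number of hw queues.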