On 10/30/2015 10:00 AM, Hannes Reinecke wrote:
> On 10/30/2015 03:12 PM, Chad Dupuis wrote:
>>
>> On Fri, 30 Oct 2015, Hannes Reinecke wrote:
>>
>>> On 10/30/2015 02:25 PM, Chad Dupuis wrote:
>>>>
>>>> On Fri, 30 Oct 2015, Hannes Reinecke wrote:
>>>>
>>>>> On 10/28/2015 09:11 PM, Chad Dupuis wrote:
>>>>>> Hi Folks,
>>>>>>
>>>>>> We've begun to explore blk-mq and scsi-mq and wanted to know if there
>>>>>> were any best practices in terms of block layer settings. We're looking
>>>>>> specifically at the FCoE and iSCSI protocols.
>>>>>>
>>>>>> A little background on the queues in our hardware first: we have a
>>>>>> per-connection transmit queue and multiple, global receive queues. The
>>>>>> transmit queues are not pegged to a particular CPU. The receive queues
>>>>>> are pegged to the first N CPUs, where N is the number of receive queues.
>>>>>> We set the nr_hw_queues in the scsi_host_template to N as well.
>>>>>>
>>>>> Weelll ... I think you'll run into issues here.
>>>>> The whole point of the multiqueue implementation is that you can tie the
>>>>> submission _and_ completion queue to a single CPU, thereby eliminating
>>>>> locking.
>>>>> If you only peg the completion queue to a CPU you'll still have
>>>>> contention on the submission queue, needing to take locks etc.
>>>>>
>>>>> Plus you will _inevitably_ incur cache misses, as the completion will
>>>>> basically never occur on the same CPU which did the submission.
>>>>> Hence the context needs to be bounced to the CPU holding the completion
>>>>> queue, or you'll need to do an IPI to inform the submitting CPU.
>>>>> But if you do that you're essentially doing single-queue submission,
>>>>> so I doubt we'd see that great an improvement.
>>>>
>>>> This was why I was asking whether there is a blk-mq API for setting the
>>>> CPU affinity of the hardware context queues, so I could steer the
>>>> submissions to the CPUs that my receive queues are on (even if they are
>>>> allowed to float).
>>>>
>>> But what would that achieve?
>>> Each of the hardware context queues would still have to use the
>>> same submission queue, so you'd have to have some serialisation
>>> with spinlocks et al. during submission. Which is what blk-mq
>>> tries to avoid.
>>> Am I wrong?
>>
>> Sadly, no, I believe you're correct. So essentially the upshot seems to
>> be that if you can't have a 1:1 request:response queue mapping, sticking
>> with the older queuecommand method is better?
>>
> Hmm; you might be getting some performance improvements as the
> submission path from the block layer down is more efficient, but in
> your case the positive effects might be eliminated by reducing the
> number of receive queues.
> But then you never know until you try :-)
>
> The alternative would indeed be to move to MC/S with blk-mq; that
> should give you some benefits, as you'd be able to utilize several queues.
> I have actually discussed that with Emulex; moving to MC/S in the iSCSI
> stack might indeed be viable when using blk-mq. It would be a rather
> good match with the existing blk-mq implementation, and most of the
> implementation would be in the iSCSI stack, reducing the burden on the
> driver vendors :-)
>

I think the multi-session mq stuff would actually just work too. It was
done with hw iSCSI in mind. MC/S might be nicer in their case, though.

For qla4xxx-type cards, would all the MC/S stuff be done in firmware, so
that all you need is a common interface to expose the connection details
and then some common code to map them to hw queues?
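
For reference, later kernels did grow a hook for the steering question raised
above: a low-level driver sets shost->nr_hw_queues to its hardware queue count
and can supply a map_queues callback in its scsi_host_template to control how
the blk-mq hardware contexts are spread over CPUs. Below is a rough sketch of
that shape only; the foo_* names are invented for illustration, most of the
driver (queuecommand, interrupt handling, teardown) is omitted, and the exact
map_queues return type (int vs. void) has varied across kernel versions, so
this should not be read as the API of any particular release.

/*
 * Illustrative sketch only -- the foo_* names are invented and most of
 * the driver is omitted.  Assumes a reasonably recent kernel in which
 * scsi_host_template has a map_queues hook.
 */
#include <linux/blk-mq.h>
#include <linux/device.h>
#include <scsi/scsi_host.h>

#define FOO_NUM_RX_QUEUES	8	/* "N" from the discussion above */

/*
 * Spread the CPUs over our N hardware contexts.  This uses the generic
 * blk_mq_map_queues() spread; a driver whose receive queues are bound to
 * specific MSI-X vectors would instead fill qmap->mq_map[] from those
 * vectors' IRQ affinity masks, so that a command is submitted on a CPU
 * that belongs to the queue which will see its completion.
 */
static void foo_map_queues(struct Scsi_Host *shost)
{
	struct blk_mq_queue_map *qmap = &shost->tag_set.map[HCTX_TYPE_DEFAULT];

	blk_mq_map_queues(qmap);
}

static struct scsi_host_template foo_sht = {
	.name		= "foo",
	.can_queue	= 1024,
	.this_id	= -1,
	.sg_tablesize	= SG_ALL,
	.map_queues	= foo_map_queues,
	/* .queuecommand and the rest of the template omitted for brevity */
};

static int foo_attach(struct device *dev)
{
	struct Scsi_Host *shost = scsi_host_alloc(&foo_sht, 0);

	if (!shost)
		return -ENOMEM;

	/* One blk-mq hardware context per hardware receive queue. */
	shost->nr_hw_queues = FOO_NUM_RX_QUEUES;

	return scsi_add_host(shost, dev);
}

With a 1:1 queue/interrupt layout this gets submission and completion onto the
same CPU, which is the locking and cache-line win Hannes describes above; with
per-connection transmit queues and shared receive queues it can only reduce,
not remove, the contention on the submission side.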