Re: scsi-mq performance check

On 18/12/2015 15:19, Bart Van Assche wrote:
On 12/18/2015 04:08 PM, Hannes Reinecke wrote:
On 12/18/2015 03:58 PM, John Garry wrote:
Hi,

I have started to enable scsi-mq on the HiSilicon SAS driver.

Are there hints/checks I should use to make sure it is configured
correctly/optimally? In my initial testing I have seen some
performance improvements, but none like what I have seen in
presentations.

The whole thing is built around having symmetric submit and receive
queues, so that we can tack a send/receive queue pair to the same CPU.
With that we can ensure that we don't have any cache invalidation, as
the request is already in the cache for that CPU when the completion is
received. _And_ we can get rid of most spinlocks as other CPUs cannot
access our request.

So make sure to have the submit and receive queues properly done, and
ensure you don't have any global resources within your driver that
need to be locked. Or move access to those resources out of the fast
path.
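
As a rough illustration of the layout described above, here is a user-space sketch in which every submit/receive pair keeps its own state on its own cache line, so the fast path touches nothing global. It is not taken from any driver; the names (queue_pair, qp_submit) and the sizes are assumptions made for the example.

```c
#include <assert.h>
#include <stdalign.h>
#include <stdint.h>

#define NR_QUEUES   32  /* assumed number of queue pairs */
#define QUEUE_DEPTH 64  /* assumed slots per queue */
#define CACHE_LINE  64  /* assumed cache-line size in bytes */

/*
 * One submit/receive pair. The alignment pads each pair to a full
 * cache line so that two CPUs working on different pairs never
 * write to the same line.
 */
struct queue_pair {
    alignas(CACHE_LINE) uint32_t submit_wr; /* next free submit slot */
    uint32_t complete_rd;                   /* next completion to reap */
    uint64_t submitted;                     /* per-queue counter, not a global one */
};

struct hba {
    struct queue_pair qp[NR_QUEUES];
};

/* Take the next submit slot on queue q; only that pair's state is touched. */
static uint32_t qp_submit(struct hba *h, unsigned int q)
{
    struct queue_pair *qp;
    uint32_t slot;

    assert(q < NR_QUEUES);
    qp = &h->qp[q];
    slot = qp->submit_wr;
    qp->submit_wr = (slot + 1) % QUEUE_DEPTH;
    qp->submitted++;
    return slot;
}
```

The sketch only stays lock-free if a single CPU ever submits on a given pair; once several CPUs share a pair, some form of per-pair serialisation is needed again.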

Hello John,

It's great news that you started looking into scsi-mq support :-) As
Hannes wrote, if the performance improvement is not as big as you
expected this could be caused e.g. by lock contention. Are you familiar
with the perf tool? The perf tool can be a great help to verify whether
lock contention occurs and also which lock(s) cause it.

Bart.

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Thanks for the replies.

One of my main concerns is how we use a spinlock in our task exec function to prepare and deliver a frame to the hardware:
hisi_sas_task_exec()
{
    ...

    /* protect task_prep and start_delivery sequence */
    spin_lock_irqsave(&hisi_hba->lock, flags);
    rc = hisi_sas_task_prep(task, hisi_hba, is_tmf, tmf, &pass);
    ...
    hisi_hba->hw->start_delivery(hisi_hba);
    spin_unlock_irqrestore(&hisi_hba->lock, flags);

    ...
}

We have to lock due to how we reserve a slot in the delivery queue. We are looking to optimise this, but it's not straightforward.
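
One direction this could take, sketched in user-space C under stated assumptions: keep the reserve-then-deliver sequence atomic, but guard it with a lock per delivery queue instead of one lock for the whole HBA. The names (delivery_queue, dq_deliver) are hypothetical, the atomic_flag is only a stand-in for a kernel spinlock, and the write-pointer update stands in for whatever start_delivery does on real hardware.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define DQ_DEPTH 64  /* assumed slots per delivery queue */

struct delivery_queue {
    atomic_flag lock;  /* stand-in for a per-queue spinlock */
    uint32_t wr_ptr;   /* next slot to hand to the hardware */
};

static void dq_init(struct delivery_queue *dq)
{
    atomic_flag_clear(&dq->lock);
    dq->wr_ptr = 0;
}

static void dq_lock(struct delivery_queue *dq)
{
    /* Spin until the flag was previously clear. */
    while (atomic_flag_test_and_set_explicit(&dq->lock, memory_order_acquire))
        ;
}

static void dq_unlock(struct delivery_queue *dq)
{
    atomic_flag_clear_explicit(&dq->lock, memory_order_release);
}

/*
 * Reserve a slot and publish the new write pointer while holding only
 * this queue's lock. Submitters on other queues are not serialised.
 */
static uint32_t dq_deliver(struct delivery_queue *dq)
{
    uint32_t slot;

    dq_lock(dq);
    slot = dq->wr_ptr;                  /* reserve the slot */
    /* frame preparation for this slot would go here */
    dq->wr_ptr = (slot + 1) % DQ_DEPTH; /* models the delivery step */
    dq_unlock(dq);
    return slot;
}
```

Holding the lock across both steps keeps the ordering guarantee the current code relies on: the write pointer never advances past a slot that is still being prepared. Contention then only arises between CPUs that share a delivery queue.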

Perf is a good strategy, but, to be honest, I have not spent a lot of time looking at this so I'm looking for low-hanging fruit initially.

FYI, our hardware does have the same number of delivery and completion queues (32), and 16 cores. One thing to note is that a command which was sent on delivery queue x is not guaranteed to complete on the matching completion queue x.
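
To make the numbers concrete, here is a small sketch of a fixed one-queue-per-CPU mapping using the counts above. The mapping function is an assumption for illustration only; it is not how the block layer or our driver assigns queues.

```c
#include <assert.h>

#define NR_CPUS   16  /* cores, as stated above */
#define NR_QUEUES 32  /* delivery/completion queues, as stated above */

/* Hypothetical static mapping: each CPU always submits on one queue. */
static unsigned int cpu_to_queue(unsigned int cpu)
{
    assert(cpu < NR_CPUS);
    return cpu % NR_QUEUES;
}

/* Count the queues that at least one CPU maps to. */
static unsigned int queues_in_use(void)
{
    unsigned char used[NR_QUEUES] = {0};
    unsigned int cpu, q, n = 0;

    for (cpu = 0; cpu < NR_CPUS; cpu++)
        used[cpu_to_queue(cpu)] = 1;
    for (q = 0; q < NR_QUEUES; q++)
        n += used[q];
    return n;
}
```

With this mapping only 16 of the 32 delivery queues ever carry submissions, and each is used by exactly one CPU. Because the completion may arrive on a different queue, the completion side does not automatically get the same CPU locality.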

cheers,




