On 11/25/2019 10:28 AM, Ewan D. Milne wrote:
> On Fri, 2019-11-22 at 10:14 -0800, Bart Van Assche wrote:
>> Hi Ming,
>> Thanks for having shared these numbers. I think this is very useful
>> information. Do these results show the performance drop that happens if
>> /sys/block/.../device/queue_depth exceeds .can_queue? What I am
>> wondering about is how important these results are in the context of
>> this discussion. Are there any modern SCSI devices for which a SCSI LLD
>> sets scsi_host->can_queue and scsi_host->cmd_per_lun such that the
>> device responds with BUSY? What surprised me is that only three SCSI
>> LLDs call scsi_track_queue_full() (mptsas, bfa, esp_scsi). Does that
>> mean that BUSY responses from a SCSI device or HBA are rare?
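(For reference, the two knobs Bart names live in the LLD's host template; the
midlayer copies them into the Scsi_Host at scsi_host_alloc() time. A minimal
sketch, assuming a made-up "exampledrv" LLD rather than any in-tree driver:)

#include <linux/module.h>
#include <scsi/scsi_host.h>
#include <scsi/scsi_cmnd.h>

static int exampledrv_queuecommand(struct Scsi_Host *shost,
				   struct scsi_cmnd *scmd);

static struct scsi_host_template exampledrv_template = {
	.module		= THIS_MODULE,
	.name		= "exampledrv",
	.queuecommand	= exampledrv_queuecommand,
	.can_queue	= 1024,	/* HBA-wide limit on outstanding commands */
	.cmd_per_lun	= 32,	/* initial per-LUN queue depth */
	.this_id	= -1,
};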
> Some FC HBAs end up returning busy from ->queuecommand() but I think
> this is more commonly due to there being an issue with the rport rather
> than the device.
>
> -Ewan
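(Roughly, that pattern looks like the sketch below - not lifted from any
specific driver, and using the pre-5.16 per-command done hook that was current
at the time of this thread. The rport check comes first so the FC transport
class, not the LLD, decides whether the command is failed or retried:)

#include <scsi/scsi_cmnd.h>
#include <scsi/scsi_host.h>
#include <scsi/scsi_device.h>
#include <scsi/scsi_transport_fc.h>

static int exampledrv_queuecommand(struct Scsi_Host *shost,
				   struct scsi_cmnd *scmd)
{
	struct fc_rport *rport = starget_to_rport(scsi_target(scmd->device));
	int rval;

	/* rport blocked or gone: complete with the transport-chosen result */
	rval = fc_remote_port_chkready(rport);
	if (rval) {
		scmd->result = rval;
		scmd->scsi_done(scmd);	/* newer kernels: scsi_done(scmd) */
		return 0;
	}

	/* ... driver resource checks and actual command submission here ... */
	return 0;
}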
True - but I would assume busy from queuecommand() is different from
BUSY/QUEUE_FULL via a SCSI response.
Adapter ->queuecommand() busies can come from out-of-resource limits in the
driver - such as when an I_T I/O count limit enforced by the driver is
reached, or when some other adapter resource limit is hit. can_queue
covers most of those - but we sometimes overcommit the adapter, with a
can_queue on the physical port as well as on each NPIV port, or with SCSI
and NVMe sharing the same port.
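(A sketch of that out-of-resource case, continuing the made-up exampledrv
above; the per-I_T bookkeeping structure and its limit are invented. Returning
one of the SCSI_MLQUEUE_*_BUSY codes makes the midlayer requeue and retry the
command later, without the target ever seeing it:)

#include <linux/atomic.h>
#include <scsi/scsi.h>

/* invented per-I_T-nexus bookkeeping for the sketch */
struct exampledrv_itn {
	atomic_t	io_cnt;		/* outstanding I/Os on this I_T nexus */
	int		io_limit;	/* driver-enforced I_T I/O cap */
};

/* called from ->queuecommand() after the rport check in the earlier sketch */
static int exampledrv_check_resources(struct exampledrv_itn *itn)
{
	if (atomic_inc_return(&itn->io_cnt) > itn->io_limit) {
		atomic_dec(&itn->io_cnt);
		/* out of driver resources: the midlayer requeues and retries */
		return SCSI_MLQUEUE_TARGET_BUSY;
	}
	return 0;
}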
Going back to Bart's question - with SANs and multiple initiators
sharing a target, and lots of LUNs on that target, it's very common to
hit bursty conditions where the target may reply with QUEUE_FULL. Many
arrays provide thick tuning guides on how to set up values on multiple
hosts, but that's mainly to help the target avoid being completely overrun,
as some didn't handle it well. In the end, it's very hard to predict
multi-initiator load, and in a lot of cases things are usually left a
bit overcommitted, as the performance downside otherwise is significant.
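(And for completeness, the ramp-down hook Bart counted callers of gets wired
up from an LLD's completion path when the target returns TASK SET FULL; again
a made-up sketch, not code from mptsas/bfa/esp_scsi:)

#include <linux/types.h>
#include <scsi/scsi.h>
#include <scsi/scsi_cmnd.h>
#include <scsi/scsi_device.h>
#include <scsi/scsi_proto.h>

/* invented completion helper; status is the SCSI status byte from the target */
static void exampledrv_handle_status(struct scsi_cmnd *scmd, u8 status)
{
	if (status == SAM_STAT_TASK_SET_FULL)
		/*
		 * Let the midlayer track repeated QUEUE_FULLs and ramp the
		 * per-device queue depth down below what was outstanding
		 * when the target pushed back.
		 */
		scsi_track_queue_full(scmd->device,
				      scmd->device->queue_depth - 1);

	scmd->result = (DID_OK << 16) | status;
	scmd->scsi_done(scmd);	/* newer kernels: scsi_done(scmd) */
}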
-- james