Re: [PATCH 4/4] scsi: core: don't limit per-LUN queue depth for SSD

Ming Lei <ming.lei@xxxxxxxxxx> · Sat, 23 Nov 2019 06:00:31 +0800

On Fri, Nov 22, 2019 at 10:14:51AM -0800, Bart Van Assche wrote:
> On 11/22/19 12:09 AM, Ming Lei wrote:
> > On Thu, Nov 21, 2019 at 04:45:48PM +0100, Hannes Reinecke wrote:
> > > On 11/21/19 1:53 AM, Ming Lei wrote:
> > > > On Wed, Nov 20, 2019 at 11:05:24AM +0100, Hannes Reinecke wrote:
> > > > > I would far prefer if we could delegate any queueing decision to the
> > > > > elevators, and completely drop the device_busy flag for all devices.
> > > > 
> > > > If you drop it, you may create big sequential IO performance drop
> > > > on HDD., that is why this patch only bypasses sdev->queue_depth on
> > > > SSD. NVMe bypasses it because no one uses HDD. via NVMe.
> > > > 
> > > I still wonder how much performance drop we actually see; what seems to
> > > happen is that device_busy just arbitrary pushes back to the block
> > > layer, giving it more time to do merging.
> > > I do think we can do better then that...
> > 
> > For example, running the following script[1] on 4-core VM:
> > 
> > ------------------------------------------
> >                      | QD:255    | QD: 32  |
> > ------------------------------------------
> > fio read throughput | 825MB/s   | 1432MB/s|
> > ------------------------------------------
> > 
> > [ ... ]
> 
> Hi Ming,
> 
> Thanks for having shared these numbers. I think this is very useful
> information. Do these results show the performance drop that happens if
> /sys/block/.../device/queue_depth exceeds .can_queue? What I am wondering

The above test just shows that IO merge plays important role here, and
one important point for triggering IO merge is that .get_budget returns
false.

If sdev->queue_depth is too big, .get_budget may never return false.

That is why this patch just bypasses .device_busy for SSD.

> about is how important these results are in the context of this discussion.
> Are there any modern SCSI devices for which a SCSI LLD sets
> scsi_host->can_queue and scsi_host->cmd_per_lun such that the device
> responds with BUSY? What surprised me is that only three SCSI LLDs call

There are many such HBAs, for which sdev->queue_depth is smaller than
.can_queue, especially in case of small number of LUNs.

> scsi_track_queue_full() (mptsas, bfa, esp_scsi). Does that mean that BUSY
> responses from a SCSI device or HBA are rare?

It is only true for some HBAs.

thanks,
Ming