Re: [RFC PATCH V4 2/2] scsi: core: don't limit per-LUN queue depth for SSD

On Wed, Oct 23, 2019 at 01:16:48PM +0530, Kashyap Desai wrote:
> > Subject: Re: [RFC PATCH V4 2/2] scsi: core: don't limit per-LUN queue depth
> > for SSD
> >
> > On Fri, Oct 18, 2019 at 12:00:07AM +0530, Kashyap Desai wrote:
> > > > On 10/9/19 2:32 AM, Ming Lei wrote:
> > > > > @@ -354,7 +354,8 @@ void scsi_device_unbusy(struct scsi_device *sdev, struct scsi_cmnd *cmd)
> > > > >   	if (starget->can_queue > 0)
> > > > >   		atomic_dec(&starget->target_busy);
> > > > >
> > > > > -	atomic_dec(&sdev->device_busy);
> > > > > +	if (!blk_queue_nonrot(sdev->request_queue))
> > > > > +		atomic_dec(&sdev->device_busy);
> > > > >   }
> > > > >
> > > >
> > > > Hi Ming,
> > > >
> > > > Does this patch impact the meaning of the queue_depth sysfs
> > > > attribute (see also sdev_store_queue_depth()) and also the queue
> > > > depth ramp up/down mechanism (see also scsi_handle_queue_ramp_up())?
> > > > Have you considered enabling/disabling busy tracking per LUN
> > > > depending on whether or not sdev->queue_depth < shost->can_queue?
> > > >
> > > > The megaraid and mpt3sas drivers read sdev->device_busy directly. Is
> > > > the current version of this patch compatible with these drivers?
> > >
> > > We need to know the number of outstanding commands per SCSI device in
> > > the mpt3sas and megaraid_sas drivers.
> >
> > Is the READ done in the fast path or the slow path? If it is in the slow
> > path, it should be easy to do via blk_mq_in_flight_rw().
> 
> The READ is done in the fast path.
> 
> >
> > > Can we get a supporting API from the block layer (through the SML)?
> > > Something similar to "atomic_read(&hctx->nr_active)", which can be
> > > derived from sdev->request_queue->hctx?
> > > At least for those drivers with nr_hw_queues = 1 it will be useful,
> > > and we can avoid the sdev->device_busy dependency.
> >
> > If you mean adding a new atomic counter, that just moves .device_busy
> > into blk-mq, and it can become a new bottleneck.
> 
> How about the below? We define and use the API below instead of
> "atomic_read(&scp->device->device_busy)", and it gives the expected
> value. I have not yet measured the performance impact on a max-IOPS
> profile.
> 
> /* Sum the in-flight requests across all hw queues of this request_queue */
> static inline unsigned long sdev_nr_inflight_request(struct request_queue *q)
> {
>         struct blk_mq_hw_ctx *hctx;
>         unsigned long nr_requests = 0;
>         int i;
> 
>         queue_for_each_hw_ctx(q, hctx, i)
>                 nr_requests += atomic_read(&hctx->nr_active);
> 
>         return nr_requests;
> }

There is still a difference between the above and .device_busy in the case
of "none", because .nr_active is actually accounted when the request is
allocated, not when the driver tag is acquired (or before .queue_rq is
called).

Also, the above only works when there is more than one active LUN, since
.nr_active is only accounted once the host's tag set is marked as shared.

If you don't need it in the single-LUN case AND don't care about the
difference in the case of "none", the above API looks fine.
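
Just for illustration, below is a rough sketch of what a driver-side call
site could look like with such a helper; sdev_nr_inflight_request() is the
helper proposed above, while my_driver_device_is_busy() and the queue_depth
comparison are made up for the example, not code from megaraid_sas or
mpt3sas:

#include <scsi/scsi_cmnd.h>
#include <scsi/scsi_device.h>

/*
 * Illustrative sketch only: a fast-path check that uses the proposed
 * helper instead of reading sdev->device_busy directly.  The function
 * name and the threshold are hypothetical.
 */
static bool my_driver_device_is_busy(struct scsi_cmnd *scp)
{
	struct scsi_device *sdev = scp->device;

	return sdev_nr_inflight_request(sdev->request_queue) >
	       sdev->queue_depth;
}

With nr_hw_queues = 1 that is a single atomic_read() per call, so the cost
should be comparable to the existing device_busy read in the fast path.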

Thanks,
Ming




