On 5/13/21 9:04 PM, Can Guo wrote:
> Hi Bart,
>
> On 2021-05-14 00:49, Bart Van Assche wrote:
>> With the current implementation of the UFS driver active_queues is 1
>> instead of 0 if all UFS request queues are idle. That causes
>> hctx_may_queue() to divide the queue depth by 2 when queueing a request
>> and hence reduces the usable queue depth.
>
> This is interesting. When all UFS queues are idle, in hctx_may_queue(),
> active_queues reads 1 (users == 1, depth == 32), where is it divided by 2?
>
> static inline bool hctx_may_queue(struct blk_mq_hw_ctx *hctx,
>                                   struct sbitmap_queue *bt)
> {
>         unsigned int depth, users;
>
>         ....
>                 users = atomic_read(&hctx->tags->active_queues);
>         }
>
>         if (!users)
>                 return true;
>
>         /*
>          * Allow at least some tags
>          */
>         depth = max((bt->sb.depth + users - 1) / users, 4U);
>         return __blk_mq_active_requests(hctx) < depth;
> }

Hi Can,

If no I/O scheduler has been configured then the active_queues counter is
increased by blk_mq_tag_busy(), called from inside blk_get_request(), before
hctx_may_queue() is called. So if active_queues == 1 while the UFS device is
idle, that counter will be increased to 2 as soon as a request is submitted
to a request queue other than hba->cmd_queue. This will cause the
hctx_may_queue() calls from inside __blk_mq_alloc_request() and
__blk_mq_get_driver_tag() to limit the queue depth to 32 / 2 = 16.

If an I/O scheduler has been configured then __blk_mq_get_driver_tag() is
the first function to call blk_mq_tag_busy() while processing a request. In
that case the hctx_may_queue() call in __blk_mq_get_driver_tag() limits the
queue depth to 32 / 2 = 16.

Bart.
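
P.S. To make the arithmetic concrete, below is a minimal user-space sketch
of the fair-share calculation performed by hctx_may_queue(). This is an
illustration rather than the kernel code itself, and it assumes a shared
tag set with bt->sb.depth == 32:

#include <stdio.h>

/*
 * Illustrative user-space re-implementation of the depth calculation in
 * hctx_may_queue(); not the kernel function itself.
 */
static unsigned int fair_share_depth(unsigned int sb_depth, unsigned int users)
{
        unsigned int depth;

        if (!users)
                return sb_depth;        /* no active queues: full depth usable */
        /* Same formula as hctx_may_queue(): "allow at least some tags". */
        depth = (sb_depth + users - 1) / users;
        return depth > 4U ? depth : 4U;
}

int main(void)
{
        /* Only one queue marked active: all 32 tags are usable. */
        printf("users == 1 -> depth == %u\n", fair_share_depth(32, 1));
        /* A second queue (e.g. hba->cmd_queue) counted as active: 16 tags. */
        printf("users == 2 -> depth == %u\n", fair_share_depth(32, 2));
        return 0;
}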