On 2021-05-14 12:24, Can Guo wrote:
On 2021-05-14 12:19, Bart Van Assche wrote:
On 5/13/21 9:04 PM, Can Guo wrote:
Hi Bart,
On 2021-05-14 00:49, Bart Van Assche wrote:
With the current implementation of the UFS driver, active_queues is 1
instead of 0 if all UFS request queues are idle. That causes
hctx_may_queue() to divide the queue depth by 2 when queueing a request
and hence reduces the usable queue depth.
This is interesting. When all UFS queues are idle, active_queues reads 1
in hctx_may_queue() (users == 1, depth == 32), so where does the divide
by 2 happen?
static inline bool hctx_may_queue(struct blk_mq_hw_ctx *hctx,
				  struct sbitmap_queue *bt)
{
	unsigned int depth, users;
	....
	users = atomic_read(&hctx->tags->active_queues);
	if (!users)
		return true;

	/*
	 * Allow at least some tags
	 */
	depth = max((bt->sb.depth + users - 1) / users, 4U);
	return __blk_mq_active_requests(hctx) < depth;
}
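
With users == 1 here, depth = max((32 + 1 - 1) / 1, 4U) = 32, i.e. the
full depth, so I don't see where the throttling happens.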
Hi Can,
If no I/O scheduler has been configured, the active_queues counter is
increased from inside blk_get_request() by blk_mq_tag_busy() before
hctx_may_queue() is called. So if active_queues == 1 while the UFS
device is idle, the counter is increased to 2 as soon as a request is
submitted to a request queue other than hba->cmd_queue. This causes the
hctx_may_queue() calls from inside __blk_mq_alloc_request() and
__blk_mq_get_driver_tag() to limit the queue depth to 32 / 2 = 16.

If an I/O scheduler has been configured, __blk_mq_get_driver_tag() is
the first function to call blk_mq_tag_busy() while processing a request,
and its hctx_may_queue() call likewise limits the queue depth to
32 / 2 = 16.
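
To make the arithmetic concrete, here is a minimal user-space sketch of
the fair-share calculation (the stand-alone program and the helper name
fair_share() are mine for illustration, not kernel code; a shared tag
set depth of 32 is assumed):

#include <stdio.h>

/* Mirrors the max((bt->sb.depth + users - 1) / users, 4U) expression. */
static unsigned int fair_share(unsigned int depth, unsigned int users)
{
	unsigned int share = (depth + users - 1) / users; /* round up */

	return share > 4U ? share : 4U; /* allow at least some tags */
}

int main(void)
{
	/* Idle UFS device: hba->cmd_queue keeps active_queues at 1. */
	printf("users == 1 -> depth %u\n", fair_share(32, 1)); /* 32 */

	/* An I/O request queue becomes active: active_queues becomes 2. */
	printf("users == 2 -> depth %u\n", fair_share(32, 2)); /* 16 */
	return 0;
}

Compiled with any C compiler, this prints 32 for the idle case and 16
once a second queue becomes active, matching the numbers above.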
Bart.
Yes, I just figured out what you are saying from the commit message and
gave my Reviewed-by tag. Thanks for the explanation and the fix.
Regards,
Can Guo.
We definitely need to have this fix present on Android12-5.10,
because performance may be impacted without it...
Thanks,
Can Guo.