On 5/13/21 9:04 PM, Can Guo wrote:
> Hi Bart,
>
> On 2021-05-14 00:49, Bart Van Assche wrote:
>> With the current implementation of the UFS driver active_queues is 1
>> instead of 0 if all UFS request queues are idle. That causes
>> hctx_may_queue() to divide the queue depth by 2 when queueing a request
>> and hence reduces the usable queue depth.
>
> This is interesting. When all UFS queues are idle, in hctx_may_queue(),
> active_queues reads 1 (users == 1, depth == 32), where is it divided by 2?
>
> static inline bool hctx_may_queue(struct blk_mq_hw_ctx *hctx,
>                                   struct sbitmap_queue *bt)
> {
>         unsigned int depth, users;
>
>         ....
>                 users = atomic_read(&hctx->tags->active_queues);
>         }
>
>         if (!users)
>                 return true;
>
>         /*
>          * Allow at least some tags
>          */
>         depth = max((bt->sb.depth + users - 1) / users, 4U);
>         return __blk_mq_active_requests(hctx) < depth;
> }

Hi Can,

If no I/O scheduler has been configured then the active_queues counter is
increased by blk_mq_tag_busy(), called from inside blk_get_request(), before
hctx_may_queue() is called. So if active_queues == 1 while the UFS device is
idle, that counter will be increased to 2 as soon as a request is submitted
to a request queue other than hba->cmd_queue. This will cause the
hctx_may_queue() calls from inside __blk_mq_alloc_request() and
__blk_mq_get_driver_tag() to limit the queue depth to 32 / 2 = 16.

If an I/O scheduler has been configured then __blk_mq_get_driver_tag() is
the first function to call blk_mq_tag_busy() while processing a request. In
that case the hctx_may_queue() call in __blk_mq_get_driver_tag() limits the
queue depth to 32 / 2 = 16.

Bart.
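
P.S. To make the arithmetic concrete, below is a minimal user-space sketch
of the fair-share calculation performed by hctx_may_queue(). This is an
illustration rather than the kernel code itself, and it assumes a shared
tag set with bt->sb.depth == 32:

#include <stdio.h>

/*
 * Illustrative user-space re-implementation of the depth calculation in
 * hctx_may_queue(); not the kernel function itself.
 */
static unsigned int fair_share_depth(unsigned int sb_depth, unsigned int users)
{
        unsigned int depth;

        if (!users)
                return sb_depth;        /* no active queues: full depth usable */
        /* Same formula as hctx_may_queue(): "allow at least some tags". */
        depth = (sb_depth + users - 1) / users;
        return depth > 4U ? depth : 4U;
}

int main(void)
{
        /* Only one queue marked active: all 32 tags are usable. */
        printf("users == 1 -> depth == %u\n", fair_share_depth(32, 1));
        /* A second queue (e.g. hba->cmd_queue) counted as active: 16 tags. */
        printf("users == 2 -> depth == %u\n", fair_share_depth(32, 2));
        return 0;
}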