On Fri, Oct 25, 2019 at 03:34:16PM +0530, Kashyap Desai wrote:
> > > > > Can we get supporting API from block layer (through SML) ?
> > > > > something similar to "atomic_read(&hctx->nr_active)" which can be
> > > > > derived from sdev->request_queue->hctx ?
> > > > > At least for those driver which is nr_hw_queue = 1, it will be
> > > > > useful and we can avoid sdev->device_busy dependency.
> > > >
> > > > If you mean to add new atomic counter, we just move the
> > > > .device_busy into blk-mq, that can become new bottleneck.
> > >
> > > How about below ? We define and use below API instead of
> > > "atomic_read(&scp->device->device_busy) >" and it is giving expected
> > > value. I have not captured performance impact on max IOPs profile.
> > >
> > > inline unsigned long sdev_nr_inflight_request(struct request_queue *q)
> > > {
> > > 	struct blk_mq_hw_ctx *hctx;
> > > 	unsigned long nr_requests = 0;
> > > 	int i;
> > >
> > > 	queue_for_each_hw_ctx(q, hctx, i)
> > > 		nr_requests += atomic_read(&hctx->nr_active);
> > >
> > > 	return nr_requests;
> > > }
> >
> > There is still difference between above and .device_busy in case of
> > none, because .nr_active is accounted actually when allocating the
> > request instead of getting driver tag(or before calling .queue_rq).
>
> This will be fine as long as we get outstanding from allocation time
> itself.

Fine, but keep that in mind.

> > Also the above only works in case that there are more than one active
> > LUNs.
>
> I am not able to understand this part. We have tested on setup which has
> only one active LUN and it works. Can you help me to understand this part
> ?

Please see blk_mq_rq_ctx_init():

	if (data->hctx->flags & BLK_MQ_F_TAG_SHARED) {
		rq_flags = RQF_MQ_INFLIGHT;
		atomic_inc(&data->hctx->nr_active);
	}

BLK_MQ_F_TAG_SHARED is set on this path:

blk_mq_init_allocated_queue
	blk_mq_add_queue_tag_set
		blk_mq_update_tag_set_depth(true)
			queue_set_hctx_shared(q, shared)

thanks,
Ming
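
The blk_mq_rq_ctx_init() check and the call chain above can be modeled in
userspace as a rough sketch. This is not kernel code: every model_/MODEL_
name is hypothetical, the atomic counter is a plain integer, and each queue
is assumed to have a single hw queue. It only illustrates why a sum over
nr_active may stay at zero while just one queue is attached to the tag set.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for BLK_MQ_F_TAG_SHARED; the value is arbitrary. */
#define MODEL_F_TAG_SHARED	(1u << 1)
#define MODEL_MAX_QUEUES	8

struct model_hctx {
	unsigned int flags;
	unsigned long nr_active;	/* atomic_t in the kernel */
};

struct model_queue {
	struct model_hctx hctx;		/* assumes nr_hw_queues = 1 */
};

struct model_tag_set {
	unsigned int flags;
	struct model_queue *queues[MODEL_MAX_QUEUES];
	size_t nr_queues;
};

/* Imitates the branch quoted from blk_mq_rq_ctx_init(). */
static void model_alloc_request(struct model_queue *q)
{
	if (q->hctx.flags & MODEL_F_TAG_SHARED)
		q->hctx.nr_active++;
}

/* Imitates the idea behind blk_mq_add_queue_tag_set(). */
static void model_add_queue(struct model_tag_set *set, struct model_queue *q)
{
	size_t i;

	assert(set->nr_queues < MODEL_MAX_QUEUES);

	/* Going from one queue to two: mark the set and existing queues. */
	if (set->nr_queues > 0 && !(set->flags & MODEL_F_TAG_SHARED)) {
		set->flags |= MODEL_F_TAG_SHARED;
		for (i = 0; i < set->nr_queues; i++)
			set->queues[i]->hctx.flags |= MODEL_F_TAG_SHARED;
	}

	/* A queue joining an already shared set is marked shared as well. */
	if (set->flags & MODEL_F_TAG_SHARED)
		q->hctx.flags |= MODEL_F_TAG_SHARED;

	set->queues[set->nr_queues++] = q;
}

/* Same idea as the proposed sdev_nr_inflight_request(), for one hctx. */
static unsigned long model_nr_inflight(const struct model_queue *q)
{
	return q->hctx.nr_active;
}
```

In this model, a request allocated while only one queue is on the tag set
leaves model_nr_inflight() at 0. Once a second queue is added, allocations
on either queue are counted.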