On 9/1/20 12:36 AM, Ming Lei wrote: > Hi Jens, > > On Mon, Aug 31, 2020 at 09:42:05PM -0600, Jens Axboe wrote: >> On 8/31/20 7:18 PM, Ming Lei wrote: >>> On Mon, Aug 31, 2020 at 11:37 PM Gabriel Krisman Bertazi >>> <krisman@xxxxxxxxxxxxx> wrote: >>>> >>>> According to Documentation/block/stat.rst, inflight should not include >>>> I/O requests that are in the queue but not yet dispatched to the device, >>>> but blk-mq identifies as inflight any request that has a tag allocated, >>>> which, for queues without elevator, happens at request allocation time >>>> and before it is queued in the ctx (default case in blk_mq_submit_bio). >>>> >>>> A more precise approach would be to only consider requests with state >>>> MQ_RQ_IN_FLIGHT. >>>> >>>> Signed-off-by: Gabriel Krisman Bertazi <krisman@xxxxxxxxxxxxx> >>>> --- >>>> block/blk-mq.c | 2 +- >>>> 1 file changed, 1 insertion(+), 1 deletion(-) >>>> >>>> diff --git a/block/blk-mq.c b/block/blk-mq.c >>>> index 0015a1892153..997b3327eaa8 100644 >>>> --- a/block/blk-mq.c >>>> +++ b/block/blk-mq.c >>>> @@ -105,7 +105,7 @@ static bool blk_mq_check_inflight(struct blk_mq_hw_ctx *hctx, >>>> { >>>> struct mq_inflight *mi = priv; >>>> >>>> - if (rq->part == mi->part) >>>> + if (rq->part == mi->part && rq->state == MQ_RQ_IN_FLIGHT) >>>> mi->inflight[rq_data_dir(rq)]++; >>> >>> The fix looks fine. However, we have helper of >>> blk_mq_request_started() for this purpose. >> >> Why does it look fine? Did you see the older commit I referenced? I'm > > Looks my gmail inbox has problem, and I didn't see your referenced > commit. but I can see your reply just now in my redhat email box, > sorry for that. Ah ok, that explains it, just felt like it was being ignored. > BTW, commit 6131837b1de6 ("blk-mq: count allocated but not started requests > in iostats inflight") didn't does what it claimed. blk_mq_queue_tag_busy_iter() > iterates over driver tags, so for real io scheduler, blk_mq_check_inflight() > basically returns count of inflight request, instead of allocated request. > > Even worse, since commit 6131837b1de6 blk_mq_in_flight() behaves inconsistently > between q->elevator and !q->elevator. I agree, that is definitely a problem, and I've run into this internally at FB in fact. >> not saying the change is wrong per se, just that this is the behavior >> we've always had, and making this change would deviate from that. As >> Gabriel states in the follow up, it's either changing the documentation >> or the patch. > > Looks iostat doesn't use the 'inflight' count, so what is the > userspace's expectation on this counter? > > If it counts allocated request, it is easy for userspace to observe > different statistics if someone updates nr_requests via sysfs. I don't think there's necessarily an expectation, but if we change the scope then that might cause problems for folks. All of a sudden the iowait looks less, for example, and that can be hard to explain. That said, I do think the patch makes sense, obviously, since that's how I originally did the mq accounting. But it's hard to argue with history and expectations, and that is my worry. I don't want to apply this patch, then have to revert the behavior a few revisions later. And then we end up with different kernels that have different metrics, and that's somewhat of a mess. -- Jens Axboe