Hi Ming,

> > kprobe output showing RQF_MQ_INFLIGHT bit is not cleared before
> > __blk_mq_free_request being called.
>
> RQF_MQ_INFLIGHT won't be cleared when the request is freed normally
> from blk_mq_free_request().

Yes, you are correct. Maybe I should capture both rq->rq_flags and
rq->state, so we know for sure whether blk_mq_free_request or
__blk_mq_put_driver_tag was called before hitting
__blk_mq_free_request.

> > b'__blk_mq_free_request+0x1 [kernel]'
> > b'bt_iter+0x50 [kernel]'
> > b'blk_mq_queue_tag_busy_iter+0x318 [kernel]'
> > b'blk_mq_timeout_work+0x7c [kernel]'
> > b'process_one_work+0x1c4 [kernel]'
> > b'worker_thread+0x4d [kernel]'
> > b'kthread+0xe6 [kernel]'
> > b'ret_from_fork+0x1f [kernel]'
>
> If __blk_mq_free_request() is called from timeout, that means this
> request has been freed by blk_mq_free_request() already, so
> __blk_mq_dec_active_requests should have been run.

We are also seeing a different call stack that could potentially
bypass __blk_mq_dec_active_requests. Do you think both could be caused
by the same underlying issue?

1976 2000 collectd __blk_mq_free_request rq_flags 0x620c0 in-flight 1
    b'__blk_mq_free_request+0x1 [kernel]'
    b'bt_iter+0x50 [kernel]'
    b'blk_mq_queue_tag_busy_iter+0x318 [kernel]'
    b'blk_mq_in_flight+0x35 [kernel]'
    b'diskstats_show+0x205 [kernel]'
    b'seq_read_iter+0x11f [kernel]'
    b'proc_req_read_iter+0x4a [kernel]'
    b'vfs_read+0x239 [kernel]'
    b'ksys_read+0xb [kernel]'
    b'do_syscall_64+0x58 [kernel]'
    b'entry_SYSCALL_64_after_hwframe+0x63 [kernel]'

> However, one case is that __blk_mq_dec_active_requests isn't called in
> blk_mq_end_request_batch, so maybe your driver is nvme with multiple
> NSs, so can you try the following patch?

Yes, we are using the nvme driver with multiple NSs. I can test this
patch and will update you on the results. I'm just curious: shouldn't
the counter already be subtracted via __blk_mq_sub_active_requests when
blk_mq_flush_tag_batch is invoked in that case? Then this would result
in double counting, is that correct?

Thanks,
Tian
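
P.S. For reading the rq_flags value in the trace above, here is a minimal
sketch of a decoder. The bit positions in RQF_BITS are an assumption taken
from include/linux/blk-mq.h in kernels of roughly the 5.16 to 6.x era, and
the table is deliberately partial; please check it against the tree that is
actually running. decode_rq_flags is a hypothetical helper name, not
something from the kernel or from BCC.

```python
# Partial map of request rq_flags bits to names.
# ASSUMPTION: bit positions follow include/linux/blk-mq.h in a
# 5.16-6.x era kernel; they differ between versions, so verify them.
RQF_BITS = {
    1: "RQF_STARTED",
    4: "RQF_FLUSH_SEQ",
    5: "RQF_MIXED_MERGE",
    6: "RQF_MQ_INFLIGHT",
    7: "RQF_DONTPREP",
    13: "RQF_IO_STAT",
    17: "RQF_STATS",
    18: "RQF_SPECIAL_PAYLOAD",
    21: "RQF_TIMED_OUT",
}


def decode_rq_flags(value):
    """Return the names of the bits set in value, lowest bit first.

    Bits missing from RQF_BITS are reported as "bit N" rather than guessed.
    """
    names = []
    bit = 0
    while value >> bit:
        if value & (1 << bit):
            names.append(RQF_BITS.get(bit, f"bit {bit}"))
        bit += 1
    return names


if __name__ == "__main__":
    # Value taken from the diskstats_show trace in this mail.
    for name in decode_rq_flags(0x620C0):
        print(name)
```

Under that assumed table, 0x620c0 has bit 6 set, which matches the
observation that RQF_MQ_INFLIGHT is still set when __blk_mq_free_request
is reached.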