Hi, Ming
On 2020/3/16 23:30, Ming Lei wrote:
On Mon, Mar 16, 2020 at 08:26:35PM +0800, Yufen Yu wrote:
Ping, and Cc more blk-mq experts.
On 2020/3/3 21:08, Yufen Yu wrote:
Our test robot reported a warning for refcount_dec trying to decrease
value '0'. The reason is that blk_mq_dispatch_rq_list() tries to complete
the failed request from the nbd driver, while the request has already been
finished in the nbd timeout handler. The race is as follows:
CPU1                            CPU2

//req->ref = 1
blk_mq_dispatch_rq_list
nbd_queue_rq
  nbd_handle_cmd
    blk_mq_start_request
                                blk_mq_check_expired
                                  //req->ref = 2
                                  blk_mq_rq_timed_out
                                    nbd_xmit_timeout
This shouldn't happen in reality, given that rq->deadline has only just
been updated in blk_mq_start_request(), assuming you use the default
30-second timeout. How can the race be triggered in such a short time?
Could you explain your test case a bit?
In fact, this was reported by syzkaller, so we do not actually have a test case.
But I think the nbd driver should not start a request in the failure case, so this patch fixes that.
Thanks,
Yufen