> I suspect this is due to we could expire a same request twice or even more.
> For scsi mid-layer, it return BLK_EH_DONE from .timeout, in fact, the request is not
> completed there, but just queue a delayed abort_work (HZ/100). If the blk_mq_timeout_work
> runs again before the abort_work, the request will be timed out again, because there is not
> any mark on it to identify this request has been timed out.
>
> Would please try the patch attached on to see whether this issue could be fixed ?
> (this patch only works for scsi device currently)

The patch isn't really going to work without a caller of your new
__blk_mq_complete_request helper, is it?

Either way the concept of doing error handling without quiescing the queue
just looks bogus to me and will end up with some sort of race here or there.
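
For reference, the double expiry described in the quoted text can be modelled
in a few lines of userspace C. This is only a rough sketch, not kernel code:
struct model_request, model_timeout() and model_scan() are hypothetical names
made up for illustration, and the "use_mark" flag stands in for whatever
marking the attached patch adds, which is not shown here. It only shows why an
unmarked request gets its abort queued once per timeout scan. It says nothing
about the quiescing concern above.

```c
#include <stdbool.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct model_request {
	bool timed_out;     /* the "mark" the quoted text says is missing */
	int aborts_queued;  /* how often the delayed abort was queued */
};

/*
 * Models a .timeout handler that reports the request as handled but
 * only queues delayed abort work instead of completing the request.
 */
void model_timeout(struct model_request *rq, bool use_mark)
{
	if (use_mark && rq->timed_out)
		return;         /* already expired once, do not expire again */
	rq->timed_out = true;
	rq->aborts_queued++;
}

/*
 * Models the timeout work visiting the same request 'scans' times
 * before the delayed abort work has had a chance to run.
 */
int model_scan(int scans, bool use_mark)
{
	struct model_request rq = { false, 0 };
	int i;

	for (i = 0; i < scans; i++)
		model_timeout(&rq, use_mark);
	return rq.aborts_queued;
}
```

In this model, model_scan(2, false) queues the abort twice, while
model_scan(2, true) queues it once.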