On Fri, May 07, 2021 at 02:31:02PM +0800, Ming Lei wrote:
> > It depends on how much time will be spent inside
> > blk_mq_clear_rq_mapping(). If the time spent in the nested loop in
> > blk_mq_clear_rq_mapping() would be significant then the proposed change
> > will help to reduce interrupt latency in blk_mq_find_and_get_req().
>
> Interrupt latency in blk_mq_find_and_get_req() shouldn't be increased,
> because interrupts won't be disabled while spinning on the lock. But
> interrupts may be disabled for a while in blk_mq_clear_rq_mapping() in
> case of a big nr_requests and hw queue depth.
>
> Fair enough, will take this way to avoid holding the lock for too long.

Can we take a step back here?  Once blk_mq_clear_rq_mapping runs we are
deep into tearing the device down and freeing the tag_set.  So if
blk_mq_find_and_get_req is spending any time waiting on the lock,
something is wrong.  We might as well just trylock in
blk_mq_find_and_get_req and not find a request if the lock is contended,
as there is no point in waiting for the lock.

In fact we might not even need a lock at all, just an atomic bit in the
tagset that marks it as being freed, which then needs to be tested in
blk_mq_find_and_get_req with the right memory barriers.
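
Rough sketch of the trylock variant, just to illustrate the idea; the
ref-taking helper is a stand-in and the field names are from memory,
so don't read this as a literal patch:

static struct request *blk_mq_find_and_get_req(struct blk_mq_tags *tags,
		unsigned int bitnr)
{
	struct request *rq = NULL;
	unsigned long flags;

	/*
	 * If blk_mq_clear_rq_mapping() holds the lock, the tag_set is
	 * already being torn down, so there is no request worth finding
	 * and no point in spinning here.
	 */
	if (!spin_trylock_irqsave(&tags->lock, flags))
		return NULL;

	rq = tags->rqs[bitnr];
	if (rq && !get_request_ref(rq))	/* stand-in for the real ref helper */
		rq = NULL;
	spin_unlock_irqrestore(&tags->lock, flags);
	return rq;
}

The atomic-bit variant would drop the lock entirely and instead
test_bit() a "freeing" flag in the tagset before touching rqs[], with
the flag set before the clearing loop and the necessary barriers
(e.g. smp_mb__after_atomic() on the writer side) so readers either see
the flag or only see still-valid rqs[] entries.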