On Tue, Apr 27, 2021 at 11:10:58PM +0800, Ming Lei wrote: > > refcount_inc_not_zero() in bt_tags_iter() still may read one freed > request. > > Fix the issue by the following approach: > > 1) hold a per-tags spinlock when reading ->rqs[tag] and calling > refcount_inc_not_zero in bt_tags_iter() > This method of closing the race still in my original patch is very nice. It's a great improvement. > 2) clearing stale request referred via ->rqs[tag] before freeing > request pool, the per-tags spinlock is held for clearing stale > ->rq[tag] > > So after we cleared stale requests, bt_tags_iter() won't observe > freed request any more, also the clearing will wait for pending > request reference. > > The idea of clearing ->rqs[] is borrowed from John Garry's previous > patch and one recent David's patch. > However, when you took my original cmpxchg patch and merged my separate function to do the cmpxchg cleaning into blk_mq_clear_rq_mapping, you missed why it was a separate function. Your patch will clean out the static_rqs requests which are being freed, but it doesn't clean out the special flush request that gets allocated individually by a request_queue. The flush request can be put directly into the rqs[] array so it also needs to be cleaned when a request_queue is being torn down. This was the second caller of my separated cleaning function. With that portion of my original patch removed, a stale pointer to a freed flush request can still remain after a request_queue is released. David Jeffery