Hi Ming,

On 8/18/21 9:09 AM, Ming Lei wrote:
> is_flush_rq() is called from bt_iter()/bt_tags_iter(), and runs the
> following check:
>
> 	hctx->fq->flush_rq == req
>
> but the passed hctx from bt_iter()/bt_tags_iter() may be NULL because:
>
> 1) memory re-order in blk_mq_rq_ctx_init():
>
> 	rq->mq_hctx = data->hctx;
> 	...
> 	refcount_set(&rq->ref, 1);
>
> OR
>
> 2) tag re-use and ->rqs[] isn't updated with new request.
>
> Fix the issue by re-writing is_flush_rq() as:
>
> 	return rq->end_io == flush_end_io;
>
> which turns out simpler to follow and immune to data race since we have
> ordered WRITE rq->end_io and refcount_set(&rq->ref, 1).
>

Recently we ran into a similar crash on ARM, caused by a NULL
rq->mq_hctx in blk_mq_put_rq_ref(), and the request was a normal write
request.

Since the memory reordering really does happen, we may still risk
accessing other uninitialized members after this commit; at least we
have to be more careful in busy_iter_fn...

So is the reason you don't use a memory barrier before refcount_set()
a performance consideration?

Thanks,
Joseph
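
P.S. To make the question concrete, here is a minimal userspace sketch
of the ordering I have in mind, written with C11 atomics rather than
kernel primitives. The struct layouts and the helpers rq_init(),
rq_get_ref() and rq_put_ref() are hypothetical stand-ins, not the real
blk-mq code. The idea is only that the plain field stores are published
by a release store to the refcount, and that a reader who successfully
takes a reference with acquire semantics is then guaranteed to see
them.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures, not the real ones. */
struct hw_ctx {
	int queue_num;
};

struct request {
	struct hw_ctx *mq_hctx;	/* plain field, written before publication */
	atomic_int ref;		/* 0 means the request is not live */
};

/* Writer: initialise the fields, then publish with a release store. */
static void rq_init(struct request *rq, struct hw_ctx *hctx)
{
	rq->mq_hctx = hctx;
	/* Release: the store to mq_hctx cannot be reordered after this. */
	atomic_store_explicit(&rq->ref, 1, memory_order_release);
}

/* Reader: take a reference only if the request is already live. */
static bool rq_get_ref(struct request *rq)
{
	int old = atomic_load_explicit(&rq->ref, memory_order_relaxed);

	while (old != 0) {
		/* Acquire on success pairs with the release store above. */
		if (atomic_compare_exchange_weak_explicit(&rq->ref, &old,
							  old + 1,
							  memory_order_acquire,
							  memory_order_relaxed))
			return true;
		/* On failure, old has been reloaded; retry unless it is 0. */
	}
	return false;
}

/* Drop a reference previously taken with rq_get_ref() or rq_init(). */
static void rq_put_ref(struct request *rq)
{
	int old = atomic_fetch_sub_explicit(&rq->ref, 1,
					    memory_order_release);

	assert(old > 0);
}
```

With this pairing, an iterator that gets true from rq_get_ref() can
dereference rq->mq_hctx safely. Without the release/acquire pair, a
weakly ordered CPU such as ARM may let the reader observe ref == 1
while mq_hctx still holds its old value. In the kernel the analogue
would presumably be a barrier before refcount_set(), which is what I
am asking about.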