On Wed, Dec 14, 2022 at 04:16:51PM +0800, Hillf Danton wrote:
> On 14 Dec 2022 10:51:01 +0800 Ming Lei <ming.lei@xxxxxxxxxx>
> > The pattern of wait_event(percpu_ref_is_zero()) has been used in several
>
> For example?

blk_mq_freeze_queue_wait() and target_wait_for_sess_cmds().

>
> > kernel components, and this way actually has the following risk:
> >
> > - percpu_ref_is_zero() can be returned just between
> >   atomic_long_sub_and_test() and ref->data->release(ref)
> >
> > - given the refcount is found as zero, percpu_ref_exit() could
> >   be called, and the host data structure is freed
> >
> > - then use-after-free is triggered in ->release() when the user host
> >   data structure is freed after percpu_ref_exit() returns
>
> The race between exit and the release callback should be considered at the
> corresponding callsite, given the comment below, and closed for instance
> by synchronizing rcu.
>
> /**
>  * percpu_ref_put_many - decrement a percpu refcount
>  * @ref: percpu_ref to put
>  * @nr: number of references to put
>  *
>  * Decrement the refcount, and if 0, call the release function (which was passed
>  * to percpu_ref_init())
>  *
>  * This function is safe to call as long as @ref is between init and exit.
>  */

Not sure if the above comment implies that the callsite should cover the race.
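
To make the window in the quoted description concrete, here is a rough
single-threaded userspace sketch. It is not kernel code: toy_ref, toy_host,
toy_put(), toy_waiter() and toy_race() are made-up names, the "free" is only a
flag so nothing is really freed, and the waiter is called by hand at the point
where a real wait_event(percpu_ref_is_zero()) waiter could wake up.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a percpu_ref embedded in a host structure. */
struct toy_ref {
	long count;
	void (*release)(struct toy_ref *ref);
};

struct toy_host {
	struct toy_ref ref;
	bool freed;               /* models freeing the host; nothing is really freed */
	bool released_after_free; /* set if ->release() ran on a "freed" host */
};

static bool toy_ref_is_zero(const struct toy_ref *ref)
{
	return ref->count == 0;
}

/* Release callback: it touches the host, which is the problematic access. */
static void toy_release(struct toy_ref *ref)
{
	struct toy_host *host =
		(struct toy_host *)((char *)ref - offsetof(struct toy_host, ref));

	if (host->freed)
		host->released_after_free = true;
}

/* Models the waiter: once the count reads zero, exit the ref and free the host. */
static void toy_waiter(struct toy_host *host)
{
	if (toy_ref_is_zero(&host->ref))
		host->freed = true;
}

/*
 * Models the final put. The waiter may observe zero between the decrement
 * and the ->release() call; waiter_in_window forces that interleaving.
 */
static void toy_put(struct toy_host *host, bool waiter_in_window)
{
	assert(host->ref.count > 0);
	if (--host->ref.count == 0) {
		if (waiter_in_window)
			toy_waiter(host);
		host->ref.release(&host->ref);
	}
}

/* Returns true if ->release() ran after the host was marked freed. */
static bool toy_race(bool waiter_in_window)
{
	struct toy_host host = {
		.ref = { .count = 1, .release = toy_release },
	};

	toy_put(&host, waiter_in_window);
	if (!waiter_in_window)
		toy_waiter(&host); /* waiter only runs after ->release() returned */
	return host.released_after_free;
}
```

toy_race(true) reports that ->release() ran on a host already marked freed,
while toy_race(false), where the waiter runs only after ->release() has
returned, does not.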
But blk-mq can really avoid the trouble by using the existing call_rcu():

diff --git a/block/blk-core.c b/block/blk-core.c
index 3866b6c4cd88..9321767470dc 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -254,14 +254,15 @@ EXPORT_SYMBOL_GPL(blk_clear_pm_only);
 
 static void blk_free_queue_rcu(struct rcu_head *rcu_head)
 {
-	kmem_cache_free(blk_requestq_cachep,
-			container_of(rcu_head, struct request_queue, rcu_head));
+	struct request_queue *q = container_of(rcu_head,
+			struct request_queue, rcu_head);
+
+	percpu_ref_exit(&q->q_usage_counter);
+	kmem_cache_free(blk_requestq_cachep, q);
 }
 
 static void blk_free_queue(struct request_queue *q)
 {
-	percpu_ref_exit(&q->q_usage_counter);
-
 	if (q->poll_stat)
 		blk_stat_remove_callback(q, q->poll_cb);
 	blk_stat_free_callback(q->poll_cb);

Thanks,
Ming