[PATCH-next] block: fix null-deref in percpu_ref_put

Zhong Jinghua <zhongjinghua@xxxxxxxxxx> · Tue, 6 Dec 2022 17:09:39 +0800

A problem was found in stable 5.10; the root cause is described below.

blk_cleanup_queue() uses
"wait_event(q->mq_freeze_wq, percpu_ref_is_zero(&q->q_usage_counter))"
to wait for the q_usage_counter of the request_queue to become zero.
However, if q_usage_counter drops to zero quickly, percpu_ref_exit() may
run and free ref->data before a concurrent percpu_ref_put() on another
CPU has dereferenced ref->data to call the release callback, which
causes a null-deref as shown below:

	CPU0                             CPU1
blk_mq_destroy_queue
 blk_freeze_queue
  blk_mq_freeze_queue_wait
				scsi_end_request
				 percpu_ref_get
				 ...
				 percpu_ref_put
				  atomic_long_sub_and_test
 blk_put_queue
  kobject_put
   kref_put
    blk_release_queue
     percpu_ref_exit
      ref->data -> NULL
   				   ref->data->release(ref) -> null-deref

As suggested by Ming Lei, fix it by fetching the release callback before
the reference count is decremented to zero.
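
The ordering described above can be illustrated with a minimal user-space
sketch. It is not the kernel code: struct ref, struct ref_data,
ref_put_many() and on_release() are hypothetical names, and C11 atomics
stand in for atomic_long_sub_and_test(). It only shows the callback pointer
being loaded before the decrement, so that ref->data is not read again once
the count may have reached zero.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct ref;
typedef void (release_fn)(struct ref *);

/* Hypothetical stand-in for the separately allocated ref->data. */
struct ref_data {
	atomic_long count;
	release_fn *release;
};

struct ref {
	struct ref_data *data;
};

/* Counts release callback invocations, for demonstration only. */
static int released;

static void on_release(struct ref *r)
{
	(void)r;
	released++;
}

/* Drops nr references; returns true if this call dropped the last one. */
static bool ref_put_many(struct ref *r, long nr)
{
	struct ref_data *d = r->data;
	/* Load the callback before the decrement, not after it. */
	release_fn *release = d->release;

	/* atomic_fetch_sub() returns the old value: old == nr means now zero. */
	if (atomic_fetch_sub(&d->count, nr) == nr) {
		release(r);	/* r->data is not dereferenced again here */
		return true;
	}
	return false;
}
```

With a count of 2, the first ref_put_many(&r, 1) leaves the callback
uncalled and the second one invokes it exactly once.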

Suggested-by: Ming Lei <ming.lei@xxxxxxxxxx>
Signed-off-by: Zhong Jinghua <zhongjinghua@xxxxxxxxxx>
---
 include/linux/percpu-refcount.h | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/include/linux/percpu-refcount.h b/include/linux/percpu-refcount.h
index d73a1c08c3e3..11e717c95acb 100644
--- a/include/linux/percpu-refcount.h
+++ b/include/linux/percpu-refcount.h
@@ -331,8 +331,11 @@ static inline void percpu_ref_put_many(struct percpu_ref *ref, unsigned long nr)
 
 	if (__ref_is_percpu(ref, &percpu_count))
 		this_cpu_sub(*percpu_count, nr);
-	else if (unlikely(atomic_long_sub_and_test(nr, &ref->data->count)))
-		ref->data->release(ref);
+	else {
+		percpu_ref_func_t *release = ref->data->release;
+		if (unlikely(atomic_long_sub_and_test(nr, &ref->data->count)))
+			release(ref);
+	}
 
 	rcu_read_unlock();
 }
-- 
2.31.1