Re: [PATCH -next v6 5/5] md: protect md_thread with rcu

Logan Gunthorpe <logang@xxxxxxxxxxxx> · Thu, 13 Apr 2023 09:10:45 -0600

On 2023-04-13 05:32, Yu Kuai wrote:
> From: Yu Kuai <yukuai3@xxxxxxxxxx>
> 
> Our test reports a uaf for 'mddev->sync_thread':
> 
> T1                      T2
> md_start_sync
>  md_register_thread
>  // mddev->sync_thread is set
> 			raid1d
> 			 md_check_recovery
> 			  md_reap_sync_thread
> 			   md_unregister_thread
> 			    kfree
> 
>  md_wakeup_thread
>   wake_up
>   ->sync_thread was freed
> 
> Root cause is that there is a small windown between register thread and
> wake up thread, where the thread can be freed concurrently.
> 
> Currently, a global spinlock 'pers_lock' is borrowed to protect
> 'mddev->thread', this problem can be fixed likewise, however, there are
> similar problems elsewhere, and use a global lock for all the cases is
> not good.
> 
> This patch protect all md_thread with rcu.
> 
> Signed-off-by: Yu Kuai <yukuai3@xxxxxxxxxx>
> ---
>  drivers/md/md-bitmap.c    | 10 ++++--
>  drivers/md/md-cluster.c   |  7 ++--
>  drivers/md/md-multipath.c |  4 +--
>  drivers/md/md.c           | 69 ++++++++++++++++++---------------------
>  drivers/md/md.h           |  8 ++---
>  drivers/md/raid1.c        |  7 ++--
>  drivers/md/raid1.h        |  2 +-
>  drivers/md/raid10.c       | 21 +++++++-----
>  drivers/md/raid10.h       |  2 +-
>  drivers/md/raid5-cache.c  | 22 ++++++++-----
>  drivers/md/raid5.c        | 15 +++++----
>  drivers/md/raid5.h        |  2 +-
>  12 files changed, 91 insertions(+), 78 deletions(-)
> 
> diff --git a/drivers/md/md-bitmap.c b/drivers/md/md-bitmap.c
> index 29fd41ef55a6..ab27f66dbb1f 100644
> --- a/drivers/md/md-bitmap.c
> +++ b/drivers/md/md-bitmap.c
> @@ -1221,13 +1221,19 @@ static bitmap_counter_t *md_bitmap_get_counter(struct bitmap_counts *bitmap,
>  static void mddev_set_timeout(struct mddev *mddev, unsigned long timeout,
>  			      bool force)
>  {
> -	struct md_thread *thread = mddev->thread;
> +	struct md_thread *thread;
> +
> +	rcu_read_lock();
> +	thread = rcu_dereference(mddev->thread);
>  
>  	if (!thread)
> -		return;
> +		goto out;
>  
>  	if (force || thread->timeout < MAX_SCHEDULE_TIMEOUT)
>  		thread->timeout = timeout;
> +
> +out:
> +	rcu_read_unlock();
>  }
>  
>  /*
> diff --git a/drivers/md/md-cluster.c b/drivers/md/md-cluster.c
> index 10e0c5381d01..672dfa6a40ec 100644
> --- a/drivers/md/md-cluster.c
> +++ b/drivers/md/md-cluster.c
> @@ -362,8 +362,8 @@ static void __recover_slot(struct mddev *mddev, int slot)
>  
>  	set_bit(slot, &cinfo->recovery_map);
>  	if (!cinfo->recovery_thread) {
> -		cinfo->recovery_thread = md_register_thread(recover_bitmaps,
> -				mddev, "recover");
> +		rcu_assign_pointer(cinfo->recovery_thread,
> +			md_register_thread(recover_bitmaps, mddev, "recover"));
>  		if (!cinfo->recovery_thread) {
>  			pr_warn("md-cluster: Could not create recovery thread\n");
>  			return;
> @@ -889,7 +889,8 @@ static int join(struct mddev *mddev, int nodes)
>  	}
>  	/* Initiate the communication resources */
>  	ret = -ENOMEM;
> -	cinfo->recv_thread = md_register_thread(recv_daemon, mddev, "cluster_recv");
> +	rcu_assign_pointer(cinfo->recv_thread,
> +			md_register_thread(recv_daemon, mddev, "cluster_recv"));
>  	if (!cinfo->recv_thread) {

This looks like it'll hit sparse warnings. without an
rcu_access_pointer(). Might be nicer to use a temporary variable, check
that it's not null, then call rcu_assign_pointer().

--
dm-devel mailing list
dm-devel@xxxxxxxxxx
https://listman.redhat.com/mailman/listinfo/dm-devel