Re: [PATCH -next v5 6/6] md: protect md_thread with rcu

Logan Gunthorpe <logang@xxxxxxxxxxxx> · Mon, 10 Apr 2023 09:42:35 -0600

On 2023-04-10 05:35, Yu Kuai wrote:
> From: Yu Kuai <yukuai3@xxxxxxxxxx>
> 
> Our test reports a uaf for 'mddev->sync_thread':
> 
> T1                      T2
> md_start_sync
>  md_register_thread
>  // mddev->sync_thread is set
> 			raid1d
> 			 md_check_recovery
> 			  md_reap_sync_thread
> 			   md_unregister_thread
> 			    kfree
> 
>  md_wakeup_thread
>   wake_up
>   ->sync_thread was freed
> 
> Root cause is that there is a small window between register thread and
> wake up thread, where the thread can be freed concurrently.
> 
> Currently, a global spinlock 'pers_lock' is borrowed to protect
> 'mddev->thread'. This problem could be fixed the same way; however, there
> might be similar problems elsewhere, and using a global lock for all of
> these cases is not good.
> 
> This patch protects md_thread with rcu.
> 
> Signed-off-by: Yu Kuai <yukuai3@xxxxxxxxxx>
> ---
>  drivers/md/md-bitmap.c   | 29 ++++++++++++-----
>  drivers/md/md.c          | 68 +++++++++++++++++++---------------------
>  drivers/md/md.h          | 10 +++---
>  drivers/md/raid1.c       |  4 +--
>  drivers/md/raid1.h       |  2 +-
>  drivers/md/raid10.c      | 10 ++++--
>  drivers/md/raid10.h      |  2 +-
>  drivers/md/raid5-cache.c | 15 +++++----
>  drivers/md/raid5.c       |  4 +--
>  drivers/md/raid5.h       |  2 +-
>  10 files changed, 81 insertions(+), 65 deletions(-)
> 
> diff --git a/drivers/md/md-bitmap.c b/drivers/md/md-bitmap.c
> index 29fd41ef55a6..b9baeea5605e 100644
> --- a/drivers/md/md-bitmap.c
> +++ b/drivers/md/md-bitmap.c
> @@ -1219,15 +1219,27 @@ static bitmap_counter_t *md_bitmap_get_counter(struct bitmap_counts *bitmap,
>  					       int create);
>  
>  static void mddev_set_timeout(struct mddev *mddev, unsigned long timeout,
> -			      bool force)
> +			      bool force, bool protected)
>  {
> -	struct md_thread *thread = mddev->thread;
> +	struct md_thread *thread;
> +
> +	if (!protected) {
> +		rcu_read_lock();
> +		thread = rcu_dereference(mddev->thread);
> +	} else {
> +		thread = rcu_dereference_protected(mddev->thread,
> +				lockdep_is_held(&mddev->reconfig_mutex));
> +	}

Why not just always use rcu_read_lock()? Even if it's safe with
reconfig_mutex, it wouldn't harm much and would make the code a bit less
ugly.
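
For illustration, the unconditional variant might look roughly like the
sketch below. It is untested and written to compile in userspace: the
rcu_* helpers are trivial stand-ins for the kernel primitives (they give
no real protection), the structs are cut down to the fields used, and the
timeout update is an assumption about the part of the function not quoted
above.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace stand-ins for the kernel RCU API (assumptions for this sketch). */
static int rcu_depth;
static inline void rcu_read_lock(void)   { rcu_depth++; }
static inline void rcu_read_unlock(void) { assert(rcu_depth > 0); rcu_depth--; }
#define rcu_dereference(p) (p)

/* Cut-down versions of the md structures, only the fields used here. */
struct md_thread { unsigned long timeout; };
struct mddev { struct md_thread *thread; };

/* No 'protected' flag: always take the RCU read lock, with or without
 * reconfig_mutex held by the caller. */
static void mddev_set_timeout(struct mddev *mddev, unsigned long timeout,
			      bool force)
{
	struct md_thread *thread;

	rcu_read_lock();
	thread = rcu_dereference(mddev->thread);
	/* Assumed body: update only when forced or when the new value is shorter. */
	if (thread && (force || thread->timeout > timeout))
		thread->timeout = timeout;
	rcu_read_unlock();
}
```

That drops the extra parameter from every caller, at the cost of a
read-side critical section that is redundant when the mutex is held.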

> @@ -458,8 +454,10 @@ static void md_submit_bio(struct bio *bio)
>   */
>  void mddev_suspend(struct mddev *mddev)
>  {
> -	WARN_ON_ONCE(mddev->thread && current == mddev->thread->tsk);
> -	lockdep_assert_held(&mddev->reconfig_mutex);
> +	struct md_thread *thread = rcu_dereference_protected(mddev->thread,
> +			lockdep_is_held(&mddev->reconfig_mutex));

Do we know that reconfig_mutex is always held when we call
md_unregister_thread()? Seems plausible, but maybe it's worth adding a
lockdep_assert_held() to md_unregister_thread().
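
A rough idea of the assertion, again as an untested userspace sketch: the
mutex and lockdep_assert_held() here are simple stand-ins, the structs are
reduced to what the example needs, and passing mddev into the function is
an assumption made so that the mutex is reachable. The kernel version
would also need the RCU pointer update, the grace period wait and
kthread_stop(), all omitted here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Userspace stand-ins for the kernel mutex and lockdep check (assumptions). */
struct mutex { bool held; };
static inline void mutex_lock(struct mutex *m)   { m->held = true; }
static inline void mutex_unlock(struct mutex *m) { m->held = false; }
#define lockdep_assert_held(l) assert((l)->held)

/* Cut-down versions of the md structures. */
struct md_thread { int placeholder; };
struct mddev {
	struct mutex reconfig_mutex;
	struct md_thread *thread;
};

/* Hypothetical signature: the mddev argument is added for this sketch. */
static void md_unregister_thread(struct mddev *mddev,
				 struct md_thread **threadp)
{
	struct md_thread *thread;

	/* Document and check the locking rule at the point of the free. */
	lockdep_assert_held(&mddev->reconfig_mutex);

	thread = *threadp;
	if (!thread)
		return;

	*threadp = NULL;
	/* Kernel code would wait for RCU readers and stop the kthread here. */
	free(thread);
}
```

If some caller does not hold the mutex, the assertion would flag it under
lockdep rather than leaving the rule implicit.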

Thanks,

Logan