Patch "md/raid10: fix task hung in raid10d" has been added to the 6.2-stable tree

Sasha Levin <sashal@xxxxxxxxxx> · Sat, 6 May 2023 09:01:53 -0400

This is a note to let you know that I've just added the patch titled

    md/raid10: fix task hung in raid10d

to the 6.2-stable tree which can be found at:
    http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary

The filename of the patch is:
     md-raid10-fix-task-hung-in-raid10d.patch
and it can be found in the queue-6.2 subdirectory.

If you, or anyone else, feels it should not be added to the stable tree,
please let <stable@xxxxxxxxxxxxxxx> know about it.



commit 58491d052144e13b1d17e50ba3a8b5cd72c72017
Author: Li Nan <linan122@xxxxxxxxxx>
Date:   Wed Feb 22 12:09:59 2023 +0800

    md/raid10: fix task hung in raid10d
    
    [ Upstream commit 72c215ed8731c88b2d7e09afc51fffc207ae47b8 ]
    
    Commit fe630de009d0 ("md/raid10: avoid deadlock on recovery.") allowed
    normal io and sync io to exist at the same time. A task hang can then
    occur as below:
    
    T1                      T2              T3              T4
    raid10d
     handle_read_error
      allow_barrier
       conf->nr_pending--
        -> 0
                            //submit sync io
                            raid10_sync_request
                             raise_barrier
                              ->will not be blocked
                              ...
                            //submit to drivers
      raid10_read_request
       wait_barrier
        conf->nr_pending++
         -> 1
                                            //retry read fail
                                            raid10_end_read_request
                                             reschedule_retry
                                              add to retry_list
                                              conf->nr_queued++
                                               -> 1
                                                            //sync io fail
                                                            end_sync_read
                                                             __end_sync_read
                                                              reschedule_retry
                                                               add to retry_list
                                                                conf->nr_queued++
                                                                 -> 2
     ...
     handle_read_error
     get from retry_list
     conf->nr_queued--
      freeze_array
       wait nr_pending == nr_queued+1
            ->1           ->2
       //task hung
    
    The retried read and the sync io are both added to retry_list
    (nr_queued -> 2) if they fail. raid10d() then calls handle_read_error()
    and hangs in freeze_array(). nr_queued will not decrease because raid10d
    is blocked, and nr_pending will not increase because conf->barrier is
    not released.
    
    Fix it by moving allow_barrier() after raid10_read_request().
    raise_barrier() will wait for nr_waiting to become 0. Therefore, sync io
    and regular io will not be issued at the same time.
    
    Also remove the check of nr_queued in stop_waiting_barrier(). nr_queued
    can be 0 there, and raid10d does not need to block in that case. Remove
    the check for MD_RECOVERY_RUNNING as well, since it is redundant.
    
    Fixes: fe630de009d0 ("md/raid10: avoid deadlock on recovery.")
    Signed-off-by: Li Nan <linan122@xxxxxxxxxx>
    Signed-off-by: Song Liu <song@xxxxxxxxxx>
    Link: https://lore.kernel.org/r/20230222041000.3341651-2-linan666@xxxxxxxxxxxxxxx
    Signed-off-by: Sasha Levin <sashal@xxxxxxxxxx>

diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
index 6c66357f92f55..db9ee3b637d6f 100644
--- a/drivers/md/raid10.c
+++ b/drivers/md/raid10.c
@@ -995,11 +995,15 @@ static bool stop_waiting_barrier(struct r10conf *conf)
 	    (!bio_list_empty(&bio_list[0]) || !bio_list_empty(&bio_list[1])))
 		return true;
 
-	/* move on if recovery thread is blocked by us */
-	if (conf->mddev->thread->tsk == current &&
-	    test_bit(MD_RECOVERY_RUNNING, &conf->mddev->recovery) &&
-	    conf->nr_queued > 0)
+	/*
+	 * move on if io is issued from raid10d(), nr_pending is not released
+	 * from original io(see handle_read_error()). All raise barrier is
+	 * blocked until this io is done.
+	 */
+	if (conf->mddev->thread->tsk == current) {
+		WARN_ON_ONCE(atomic_read(&conf->nr_pending) == 0);
 		return true;
+	}
 
 	return false;
 }
@@ -2978,9 +2982,13 @@ static void handle_read_error(struct mddev *mddev, struct r10bio *r10_bio)
 		md_error(mddev, rdev);
 
 	rdev_dec_pending(rdev, mddev);
-	allow_barrier(conf);
 	r10_bio->state = 0;
 	raid10_read_request(mddev, r10_bio->master_bio, r10_bio);
+	/*
+	 * allow_barrier after re-submit to ensure no sync io
+	 * can be issued while regular io pending.
+	 */
+	allow_barrier(conf);
 }
 
 static void handle_write_completed(struct r10conf *conf, struct r10bio *r10_bio)
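
The counter arithmetic behind the interleaving in the commit message can
be replayed with a small userspace model. This is only a sketch under
stated assumptions, not kernel code: struct model_conf, try_raise_barrier(),
freeze_condition_met() and replay() are hypothetical names, the barrier
logic is reduced to "sync io may start only while nr_pending is 0", and the
starting state assumes the failed read still holds one nr_pending reference.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the counters in struct r10conf; not the kernel type. */
struct model_conf {
	int nr_pending;	/* regular io currently accounted as pending */
	int nr_queued;	/* failed io parked on retry_list */
	int barrier;	/* sync io that got past the barrier */
};

/* Simplified raise_barrier(): in this model sync io starts only if nr_pending is 0. */
static bool try_raise_barrier(struct model_conf *c)
{
	if (c->nr_pending != 0)
		return false;
	c->barrier++;
	return true;
}

/* The condition freeze_array() waits on, as quoted in the trace. */
static bool freeze_condition_met(const struct model_conf *c, int extra)
{
	return c->nr_pending == c->nr_queued + extra;
}

/*
 * Replay the trace. allow_first == true models the old ordering
 * (allow_barrier() before raid10_read_request()), false the patched one.
 */
static struct model_conf replay(bool allow_first)
{
	/* Assumption: the failed read being handled still counts in nr_pending. */
	struct model_conf c = { .nr_pending = 1, .nr_queued = 0, .barrier = 0 };
	bool sync_issued;

	if (allow_first)
		c.nr_pending--;			/* T1: allow_barrier */
	sync_issued = try_raise_barrier(&c);	/* T2: raid10_sync_request */
	c.nr_pending++;				/* T1: wait_barrier on re-submit */
	if (!allow_first)
		c.nr_pending--;			/* T1: allow_barrier after re-submit */

	c.nr_queued++;				/* T3: retried read fails */
	if (sync_issued)
		c.nr_queued++;			/* T4: sync read fails */

	assert(c.nr_queued > 0);
	c.nr_queued--;				/* raid10d takes the read off retry_list */
	return c;
}
```

In this model, freeze_condition_met() with extra = 1 is false for
replay(true), which ends with nr_pending = 1 and nr_queued = 1, and true
for replay(false), where the sync io never starts and nr_queued drops
back to 0.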