This is a note to let you know that I've just added the patch titled raid1: include bio_end_io_list in nr_queued to prevent freeze_array hang to the 4.5-stable tree which can be found at: http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary The filename of the patch is: raid1-include-bio_end_io_list-in-nr_queued-to-prevent-freeze_array-hang.patch and it can be found in the queue-4.5 subdirectory. If you, or anyone else, feels it should not be added to the stable tree, please let <stable@xxxxxxxxxxxxxxx> know about it. >From ccfc7bf1f09d6190ef86693ddc761d5fe3fa47cb Mon Sep 17 00:00:00 2001 From: Nate Dailey <nate.dailey@xxxxxxxxxxx> Date: Mon, 29 Feb 2016 10:43:58 -0500 Subject: raid1: include bio_end_io_list in nr_queued to prevent freeze_array hang From: Nate Dailey <nate.dailey@xxxxxxxxxxx> commit ccfc7bf1f09d6190ef86693ddc761d5fe3fa47cb upstream. If raid1d is handling a mix of read and write errors, handle_read_error's call to freeze_array can get stuck. This can happen because, though the bio_end_io_list is initially drained, writes can be added to it via handle_write_finished as the retry_list is processed. These writes contribute to nr_pending but are not included in nr_queued. If a later entry on the retry_list triggers a call to handle_read_error, freeze array hangs waiting for nr_pending == nr_queued+extra. The writes on the bio_end_io_list aren't included in nr_queued so the condition will never be satisfied. To prevent the hang, include bio_end_io_list writes in nr_queued. There's probably a better way to handle decrementing nr_queued, but this seemed like the safest way to avoid breaking surrounding code. I'm happy to supply the script I used to repro this hang. Fixes: 55ce74d4bfe1b(md/raid1: ensure device failure recorded before write request returns.) Signed-off-by: Nate Dailey <nate.dailey@xxxxxxxxxxx> Signed-off-by: Shaohua Li <shli@xxxxxx> Signed-off-by: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx> --- drivers/md/raid1.c | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-) --- a/drivers/md/raid1.c +++ b/drivers/md/raid1.c @@ -2274,6 +2274,7 @@ static void handle_write_finished(struct if (fail) { spin_lock_irq(&conf->device_lock); list_add(&r1_bio->retry_list, &conf->bio_end_io_list); + conf->nr_queued++; spin_unlock_irq(&conf->device_lock); md_wakeup_thread(conf->mddev->thread); } else { @@ -2391,8 +2392,10 @@ static void raid1d(struct md_thread *thr LIST_HEAD(tmp); spin_lock_irqsave(&conf->device_lock, flags); if (!test_bit(MD_CHANGE_PENDING, &mddev->flags)) { - list_add(&tmp, &conf->bio_end_io_list); - list_del_init(&conf->bio_end_io_list); + while (!list_empty(&conf->bio_end_io_list)) { + list_move(conf->bio_end_io_list.prev, &tmp); + conf->nr_queued--; + } } spin_unlock_irqrestore(&conf->device_lock, flags); while (!list_empty(&tmp)) { Patches currently in stable-queue which might be from nate.dailey@xxxxxxxxxxx are queue-4.5/raid10-include-bio_end_io_list-in-nr_queued-to-prevent-freeze_array-hang.patch queue-4.5/raid1-include-bio_end_io_list-in-nr_queued-to-prevent-freeze_array-hang.patch -- To unsubscribe from this list: send the line "unsubscribe stable" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html