This is a note to let you know that I've just added the patch titled md/raid6: use valid sector values to determine if an I/O should wait on the reshape to the 6.6-stable tree which can be found at: http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary The filename of the patch is: md-raid6-use-valid-sector-values-to-determine-if-an-i-o-should-wait-on-the-reshape.patch and it can be found in the queue-6.6 subdirectory. If you, or anyone else, feels it should not be added to the stable tree, please let <stable@xxxxxxxxxxxxxxx> know about it. >From c467e97f079f0019870c314996fae952cc768e82 Mon Sep 17 00:00:00 2001 From: David Jeffery <djeffery@xxxxxxxxxx> Date: Tue, 28 Nov 2023 13:11:39 -0500 Subject: md/raid6: use valid sector values to determine if an I/O should wait on the reshape From: David Jeffery <djeffery@xxxxxxxxxx> commit c467e97f079f0019870c314996fae952cc768e82 upstream. During a reshape or a RAID6 array such as expanding by adding an additional disk, I/Os to the region of the array which have not yet been reshaped can stall indefinitely. This is from errors in the stripe_ahead_of_reshape function causing md to think the I/O is to a region in the actively undergoing the reshape. stripe_ahead_of_reshape fails to account for the q disk having a sector value of 0. By not excluding the q disk from the for loop, raid6 will always generate a min_sector value of 0, causing a return value which stalls. The function's max_sector calculation also uses min() when it should use max(), causing the max_sector value to always be 0. During a backwards rebuild this can cause the opposite problem where it allows I/O to advance when it should wait. Fixing these errors will allow safe I/O to advance in a timely manner and delay only I/O which is unsafe due to stripes in the middle of undergoing the reshape. Fixes: 486f60558607 ("md/raid5: Check all disks in a stripe_head for reshape progress") Cc: stable@xxxxxxxxxxxxxxx # v6.0+ Signed-off-by: David Jeffery <djeffery@xxxxxxxxxx> Tested-by: Laurence Oberman <loberman@xxxxxxxxxx> Signed-off-by: Song Liu <song@xxxxxxxxxx> Link: https://lore.kernel.org/r/20231128181233.6187-1-djeffery@xxxxxxxxxx Signed-off-by: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx> --- drivers/md/raid5.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) --- a/drivers/md/raid5.c +++ b/drivers/md/raid5.c @@ -5892,11 +5892,11 @@ static bool stripe_ahead_of_reshape(stru int dd_idx; for (dd_idx = 0; dd_idx < sh->disks; dd_idx++) { - if (dd_idx == sh->pd_idx) + if (dd_idx == sh->pd_idx || dd_idx == sh->qd_idx) continue; min_sector = min(min_sector, sh->dev[dd_idx].sector); - max_sector = min(max_sector, sh->dev[dd_idx].sector); + max_sector = max(max_sector, sh->dev[dd_idx].sector); } spin_lock_irq(&conf->device_lock); Patches currently in stable-queue which might be from djeffery@xxxxxxxxxx are queue-6.6/md-raid6-use-valid-sector-values-to-determine-if-an-i-o-should-wait-on-the-reshape.patch