Re: [PATCH] md/raid5: recheck if reshape has finished with device_lock held

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Jun 27, 2024 at 08:17:51PM +0800, Yu Kuai wrote:
> Hi,
> 
> 在 2024/06/27 13:37, Benjamin Marzinski 写道:
> > When handling an IO request, MD checks if a reshape is currently
> > happening, and if so, where the IO sector is in relation to the reshape
> > progress. MD uses conf->reshape_progress for both of these tasks.  When
> > the reshape finishes, conf->reshape_progress is set to MaxSector.  If
> > this occurs after MD checks if the reshape is currently happening but
> > before it calls ahead_of_reshape(), then ahead_of_reshape() will end up
> > comparing the IO sector against MaxSector. During a backwards reshape,
> > this will make MD think the IO sector is in the area not yet reshaped,
> > causing it to use the previous configuration, and map the IO to the
> > sector where that data was before the reshape.
> > 
> > This bug can be triggered by running the lvm2
> > lvconvert-raid-reshape-linear_to_raid6-single-type.sh test in a loop,
> > although it's very hard to reproduce.
> > 
> > Fix this by rechecking if the reshape has finished after grabbing the
> > device_lock.
> > 
> > Fixes: fef9c61fdfabf ("md/raid5: change reshape-progress measurement to cope with reshaping backwards.")
> > Signed-off-by: Benjamin Marzinski <bmarzins@xxxxxxxxxx>
> > ---
> >   drivers/md/raid5.c | 18 ++++++++++--------
> >   1 file changed, 10 insertions(+), 8 deletions(-)
> > 
> 
> Thanks for the patch, it looks correct. However, can you factor out a
> helper and make the code more readable? The code is already quite
> complicated.

Sure.
 
> Thanks,
> Kuai
> 
> > diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
> > index 547fd15115cd..65d9b1ca815c 100644
> > --- a/drivers/md/raid5.c
> > +++ b/drivers/md/raid5.c
> > @@ -5923,15 +5923,17 @@ static enum stripe_result make_stripe_request(struct mddev *mddev,
> >   		 * to check again.
> >   		 */
> >   		spin_lock_irq(&conf->device_lock);
> > -		if (ahead_of_reshape(mddev, logical_sector,
> > -				     conf->reshape_progress)) {
> > -			previous = 1;
> > -		} else {
> > +		if (conf->reshape_progress != MaxSector) {
> >   			if (ahead_of_reshape(mddev, logical_sector,
> > -					     conf->reshape_safe)) {
> > -				spin_unlock_irq(&conf->device_lock);
> > -				ret = STRIPE_SCHEDULE_AND_RETRY;
> > -				goto out;
> > +					     conf->reshape_progress)) {
> > +				previous = 1;
> > +			} else {
> > +				if (ahead_of_reshape(mddev, logical_sector,
> > +						     conf->reshape_safe)) {
> > +					spin_unlock_irq(&conf->device_lock);
> > +					ret = STRIPE_SCHEDULE_AND_RETRY;
> > +					goto out;
> > +				}
> >   			}
> >   		}
> >   		spin_unlock_irq(&conf->device_lock);
> > 





[Index of Archives]     [Linux RAID Wiki]     [ATA RAID]     [Linux SCSI Target Infrastructure]     [Linux Block]     [Linux IDE]     [Linux SCSI]     [Linux Hams]     [Device Mapper]     [Device Mapper Cryptographics]     [Kernel]     [Linux Admin]     [Linux Net]     [GFS]     [RPM]     [git]     [Yosemite Forum]


  Powered by Linux