Re: [PATCH v3 3/3] md/raid1: check array size before reshape

Xueshi Hu <xueshi.hu@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx> · Sat, 29 Jul 2023 11:29:53 +0800



On Sat, Jul 29, 2023 at 08:58:45AM +0800, Yu Kuai wrote:
> Hi,
> 
> 在 2023/07/28 22:42, Xueshi Hu 写道:
> > On Thu, Jul 20, 2023 at 09:28:34AM +0800, Yu Kuai wrote:
> > > Hi,
> > > 
> > > 在 2023/07/19 19:51, Xueshi Hu 写道:
> > > > On Wed, Jul 19, 2023 at 09:38:26AM +0200, Paul Menzel wrote:
> > > > > Dear Xueshi,
> > > > > 
> > > > > 
> > > > > Thank you for your patches.
> > > > > 
> > > > > Am 19.07.23 um 09:09 schrieb Xueshi Hu:
> > > > > > If array size doesn't changed, nothing need to do.
> > > > > 
> > > > > Maybe: … nothing needs to be done.
> > > > What a hurry, I'll be cautious next time and check the sentences with
> > > > tools.
> > > > > 
> > > > > Do you have a test case to reproduce it?
> > > > > 
> > > > Userspace command:
> > > > 
> > > > 	echo 4 > /sys/devices/virtual/block/md10/md/raid_disks
> > > > 
> > > > Kernel function calling flow:
> > > > 
> > > > md_attr_store()
> > > > 	raid_disks_store()
> > > > 		update_raid_disks()
> > > > 			raid1_reshape()
> > > > 
> > > > Maybe I shall provide more information when submit patches, thank you for
> > > > reminding me.
> > > > > 
> > > > > Kind regards,
> > > > > 
> > > > > Paul
> > > > > 
> > > > > 
> > > > > > Signed-off-by: Xueshi Hu <xueshi.hu@xxxxxxxxxx>
> > > > > > ---
> > > > > >     drivers/md/raid1.c | 9 ++++++---
> > > > > >     1 file changed, 6 insertions(+), 3 deletions(-)
> > > > > > 
> > > > > > diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
> > > > > > index 62e86b7d1561..5840b8b0f9b7 100644
> > > > > > --- a/drivers/md/raid1.c
> > > > > > +++ b/drivers/md/raid1.c
> > > > > > @@ -3282,9 +3282,6 @@ static int raid1_reshape(struct mddev *mddev)
> > > > > >     	int d, d2;
> > > > > >     	int ret;
> > > > > > -	memset(&newpool, 0, sizeof(newpool));
> > > > > > -	memset(&oldpool, 0, sizeof(oldpool));
> > > > > > -
> > > > > >     	/* Cannot change chunk_size, layout, or level */
> > > > > >     	if (mddev->chunk_sectors != mddev->new_chunk_sectors ||
> > > > > >     	    mddev->layout != mddev->new_layout ||
> > > > > > @@ -3295,6 +3292,12 @@ static int raid1_reshape(struct mddev *mddev)
> > > > > >     		return -EINVAL;
> > > > > >     	}
> > > > > > +	if (mddev->delta_disks == 0)
> > > > > > +		return 0; /* nothing to do */
> > > 
> > > I think this is wrong, you should at least keep following:
> > > 
> > >          set_bit(MD_RECOVERY_RECOVER, &mddev->recovery);
> > >          set_bit(MD_RECOVERY_NEEDED, &mddev->recovery);
> > >          md_wakeup_thread(mddev->thread);
> > > 
> > I fail to comprehend the rationale behind the kernel's need to invoke
> > raid1d() repeatedly despite the absence of any modifications.
> > It appears that raid1d() is responsible for invoking md_check_recovery()
> > and resubmit the failed io.
> 
> No, the point here is to set MD_RECOVERY_NEEDED and MD_RECOVERY_RECOVER,
> so that md_check_recovery() can start to reshape.
I apologize, but I am still unable to comprehend your idea.
If mmdev->delta_disks == 0 , what is the purpose of invoking
md_check_recovery() again?
> 
> Thanks,
> Kuai
> > 
> > Evidently, it is not within the scope of raid1_reshape() to resubmit
> > failed io.
> > 
> > Moreover, upon reviewing the Documentation[1], I could not find any
> > indications that reshape must undertake the actions as specified in
> > md_check_recovery()'s documentation, such as eliminating faulty devices.
> > 
> > [1]: https://www.kernel.org/doc/html/latest/admin-guide/md.html
> > > Thanks,
> > > Kuai
> > > 
> > > > > > +
> > > > > > +	memset(&newpool, 0, sizeof(newpool));
> > > > > > +	memset(&oldpool, 0, sizeof(oldpool));
> > > > > > +
> > > > > >     	if (!mddev_is_clustered(mddev))
> > > > > >     		md_allow_write(mddev);
> > > > .
> > > > 
> > > 
> > Thanks,
> > Hu
> > .
> > 
>