Re: Raid5 to raid6 grow interrupted, mdadm hangs on assemble command

Hi Kuai.

> Jove, as I understand this, if mdadm makes progress without a blocked
> io, and the reshape continues, it seems you can use this array without
> problem.

I had to do some sleuthing to figure out what was accessing the array,
since I was already running a minimal FedoraCore image. The culprit
turned out to be the systemd-udevd daemon. I do not know why it
accesses the array, but if I stop it and rename its executable (it is
started automatically when the array is assembled), the reshape
continues.
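
For anyone chasing a similar hang, here is a minimal sketch of one way to
look for processes that hold the array open. It walks /proc/<pid>/fd and
compares each link target with the device path. The function name
find_openers and the device name /dev/md0 are assumptions for
illustration (the device name is not given in this thread), and this is
not necessarily how systemd-udevd was identified here. It only catches
processes that still have the device open, and it needs root to read
other users' fd directories.

```python
import os


def find_openers(device, proc_root="/proc"):
    """Return sorted (pid, comm) pairs for processes with an fd on `device`.

    `device` is a path such as /dev/md0 (hypothetical example name).
    `proc_root` is a parameter only so the function can be tested
    against a fake directory tree.
    """
    target = os.path.realpath(device)
    openers = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a process directory
        fd_dir = os.path.join(proc_root, entry, "fd")
        try:
            fds = os.listdir(fd_dir)
        except OSError:
            continue  # process exited, or we lack permission
        for fd in fds:
            try:
                link = os.readlink(os.path.join(fd_dir, fd))
            except OSError:
                continue  # fd closed while we were looking
            if link == target:
                try:
                    with open(os.path.join(proc_root, entry, "comm")) as f:
                        comm = f.read().strip()
                except OSError:
                    comm = "?"
                openers.append((int(entry), comm))
                break  # one match per process is enough
    return sorted(openers)
```

Calling find_openers("/dev/md0") as root returns a list of (pid, name)
pairs, which may be empty if nothing holds the device open at that
moment.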

Now it is just a matter of time until the reshape is finished and I
can discover just how much data I still have :)

Thank you all for your help, I will send a last mail when I know more.

Best regards,

      Johan



On Fri, May 5, 2023 at 10:02 AM Yu Kuai <yukuai1@xxxxxxxxxxxxxxx> wrote:
>
> Hi,
>
> On 2023/05/05 14:58, Wol wrote:
> > On 05/05/2023 02:34, Yu Kuai wrote:
> >>> I have had one case in which mdadm didn't hang and in which the
> >>> reshape continued. Sadly, I was using sparse overlay files and the
> >>> filesystem could not handle the full 4x 4TB. I had to terminate the
> >>> reshape.
> >>
> >> This sounds like a dead end for now; normal io beyond the reshape
> >> position must wait:
> >>
> >> raid5_make_request
> >>   make_stripe_request
> >>    ahead_of_reshape
> >>     wait_woken
> >
> > Not sure if I've got the wrong end of the stick, but if I've understood
> > correctly, that shouldn't be the case.
> >
> > Reshape takes place in a window. All io *beyond* the window is allowed
> > to proceed normally - that part of the array has not been reshaped so
> > the old parameters are used.
> >
> > All io *in front* of the window is allowed to proceed normally - that
> > part of the array has been reshaped so the new parameters are used.
> >
> > io *IN* the window is paused until the window has passed. This
> > interruption should be short and sweet.
>
> Yes, that's correct, and in this case reshape_safe should be the same as
> reshape_progress, and I guess io is stuck because
> stripe_ahead_of_reshape() returns true.
>
> So this deadlock happens when io is blocked because of reshape, and
> mddev_suspend() is waiting for this io to be done; in the meantime
> reshape can't start until mddev_suspend() returns.
>
> Jove, as I understand this, if mdadm makes progress without a blocked
> io, and the reshape continues, it seems you can use this array without
> problem.
>
> Thanks,
> Kuai
> >
> > Cheers,
> > Wol
> >
> > .
> >
>
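
The window rule Wol describes above can be sketched as a small
classifier. This is a simplified model of that description only, not
the kernel's code: it assumes a forward reshape with
reshape_safe <= reshape_progress, treats the window as the sectors
between those two positions, and the function name classify_sector is
hypothetical.

```python
def classify_sector(sector, reshape_safe, reshape_progress):
    """Say how io to `sector` is handled in this simplified model.

    Assumes a forward reshape and reshape_safe <= reshape_progress.
    """
    if sector >= reshape_progress:
        # Not reshaped yet: io proceeds with the old parameters.
        return "old layout"
    if sector >= reshape_safe:
        # Inside the window: io is paused until the window has passed.
        return "wait"
    # Already reshaped: io proceeds with the new parameters.
    return "new layout"
```

With reshape_safe equal to reshape_progress, as Kuai expects in this
case, the window is empty in this model and no sector is classified as
"wait".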



