Hi Kuai.

> Jove, As I understand this, if mdadm make progress without a blocked
> io, and reshape continues, it seems you can use this array without
> problem

I've had to do some sleuthing to figure out who was doing that array
access; I was already running a minimal FedoraCore image. I've
discovered that the culprit is the systemd-udevd daemon. I do not know
why it accesses the array, but if I stop it and rename that executable
(it gets started automatically when the array is assembled) then the
reshape continues.

Now it is just a matter of time until the reshape is finished and I
can discover just how much data I still have :)

Thank you all for your help. I will send a last mail when I know more.

Best regards,

Johan

On Fri, May 5, 2023 at 10:02 AM Yu Kuai <yukuai1@xxxxxxxxxxxxxxx> wrote:
>
> Hi,
>
> On 2023/05/05 14:58, Wol wrote:
> > On 05/05/2023 02:34, Yu Kuai wrote:
> >>> I have had one case in which mdadm didn't hang and in which the
> >>> reshape continued. Sadly, I was using sparse overlay files and the
> >>> filesystem could not handle the full 4x 4TB. I had to terminate the
> >>> reshape.
> >>
> >> This sounds like a dead end for now, normal io beyond reshape position
> >> must wait:
> >>
> >> raid5_make_request
> >>  make_stripe_request
> >>   ahead_of_reshape
> >>    wait_woken
> >
> > Not sure if I've got the wrong end of the stick, but if I've understood
> > correctly, that shouldn't be the case.
> >
> > Reshape takes place in a window. All io *beyond* the window is allowed
> > to proceed normally - that part of the array has not been reshaped so
> > the old parameters are used.
> >
> > All io *in front* of the window is allowed to proceed normally - that
> > part of the array has been reshaped so the new parameters are used.
> >
> > io *IN* the window is paused until the window has passed. This
> > interruption should be short and sweet.
> >
> Yes, it's correct, and in this case reshape_safe should be the same as
> reshape_progress, and I guess io is stuck because
> stripe_ahead_of_reshape() returns true.
>
> So this deadlock happens when io is blocked because of reshape, and
> mddev_suspend() is waiting for this io to be done, in the meantime
> reshape can't start until mddev_suspend() returns.
>
> Jove, As I understand this, if mdadm make progress without a blocked
> io, and reshape continues, it seems you can use this array without
> problem.
>
> Thanks,
> Kuai
>
> >
> > Cheers,
> > Wol
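
For anyone reading this thread in the archive later: the window
behaviour Wol describes above can be sketched as a small toy model.
This is not the md driver's code. It assumes a forward reshape and a
half-open window, and the names classify_io, window_start and
window_end are made up for illustration only.

```python
from enum import Enum


class IoAction(Enum):
    USE_NEW_LAYOUT = "new layout"  # region has already been reshaped
    USE_OLD_LAYOUT = "old layout"  # region has not been reshaped yet
    WAIT = "wait"                  # region is inside the active window


def classify_io(sector: int, window_start: int, window_end: int) -> IoAction:
    """Toy model of the reshape window (hypothetical, not kernel code).

    Assumes the reshape moves forward through the array and that the
    window covers the half-open sector range [window_start, window_end).
    """
    if window_start > window_end:
        raise ValueError("window_start must not exceed window_end")
    if sector >= window_end:
        # Beyond the window: not reshaped yet, old parameters apply.
        return IoAction.USE_OLD_LAYOUT
    if sector >= window_start:
        # Inside the window: io is paused until the window has passed.
        return IoAction.WAIT
    # In front of the window: already reshaped, new parameters apply.
    return IoAction.USE_NEW_LAYOUT


if __name__ == "__main__":
    for sector in (50, 150, 250):
        print(sector, classify_io(sector, 100, 200).value)
```

With window_start equal to window_end the window is empty and this
model never returns WAIT. It does not model the mddev_suspend()
interaction Kuai describes above.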