Hi,
On 2023/05/05 23:47, Jove wrote:
Hi Kuai.
Jove, as I understand this, if mdadm makes progress without a blocked
io, and the reshape continues, it seems you can use this array without
problem.
I've had to do some sleuthing to figure out who was doing that array
access, as I was already running a minimal FedoraCore image. I've
discovered that the culprit is the systemd-udevd daemon. I do not know
why it accesses the array, but if I stop it and rename that executable
(it gets started automatically when the array is assembled) then the
reshape continues.
Thanks for confirming this. However, I have no idea why systemd-udevd is
accessing the array.
In the meantime, I'll try to fix this deadlock; I hope you don't mind a
Reported-by tag.
Thanks,
Kuai
Now it is just a matter of time until the reshape is finished and I
can discover just how much data I still have :)
Thank you all for your help, I will send a last mail when I know more.
Best regards,
Johan
On Fri, May 5, 2023 at 10:02 AM Yu Kuai <yukuai1@xxxxxxxxxxxxxxx> wrote:
Hi,
On 2023/05/05 14:58, Wol wrote:
On 05/05/2023 02:34, Yu Kuai wrote:
I have had one case in which mdadm didn't hang and in which the
reshape continued. Sadly, I was using sparse overlay files and the
filesystem could not handle the full 4x 4TB. I had to terminate the
reshape.
This sounds like a dead end for now; normal io beyond the reshape position
must wait:
raid5_make_request
make_stripe_request
ahead_of_reshape
wait_woken
Not sure if I've got the wrong end of the stick, but if I've understood
correctly, that shouldn't be the case.
Reshape takes place in a window. All io *beyond* the window is allowed
to proceed normally - that part of the array has not been reshaped so
the old parameters are used.
All io *in front* of the window is allowed to proceed normally - that
part of the array has been reshaped so the new parameters are used.
io *IN* the window is paused until the window has passed. This
interruption should be short and sweet.
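
The three cases above can be sketched as a small classifier. This is only
an illustration, not the md driver's code: the names classify, Region,
window_start and window_end are hypothetical, and it assumes a reshape that
advances from low to high sectors.

```python
from enum import Enum


class Region(Enum):
    RESHAPED = "proceed with new parameters"
    IN_WINDOW = "pause until the window has passed"
    NOT_RESHAPED = "proceed with old parameters"


def classify(sector: int, window_start: int, window_end: int) -> Region:
    """Classify an io by its position relative to the reshape window.

    Assumption: the reshape moves forward, so everything below
    window_start is already reshaped and everything at or above
    window_end has not been touched yet.
    """
    if sector < window_start:
        return Region.RESHAPED
    if sector < window_end:
        return Region.IN_WINDOW
    return Region.NOT_RESHAPED


# Example window covering sectors [1000, 2000).
for s in (500, 1500, 5000):
    print(s, classify(s, 1000, 2000).value)
```

Only the middle case should ever have to wait, and only for as long as the
window takes to move on.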
Yes, it's correct, and in this case reshape_safe should be the same as
reshape_progress, and I guess io is stuck because
stripe_ahead_of_reshape() returns true.
So this deadlock happens when io is blocked because of reshape, and
mddev_suspend() is waiting for this io to be done; in the meantime
reshape can't start until mddev_suspend() returns.
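
To make the circular wait explicit, here is a toy wait-for graph of the
three parties described above. It is a sketch for illustration only: the
node labels and the find_cycle helper are hypothetical and do not
correspond to any kernel data structure.

```python
def find_cycle(waits_for: dict, start: str) -> list:
    """Follow 'X waits for Y' edges from start; return the cycle if one exists."""
    seen = []
    node = start
    while node in waits_for:
        if node in seen:
            # We came back to a node already on the path: that is the cycle.
            return seen[seen.index(node):]
        seen.append(node)
        node = waits_for[node]
    return []


# Edges taken from the description above.
waits_for = {
    "blocked io": "reshape",        # io is blocked because of reshape
    "reshape": "mddev_suspend",     # reshape can't start until mddev_suspend() returns
    "mddev_suspend": "blocked io",  # mddev_suspend() waits for this io to be done
}
print(find_cycle(waits_for, "blocked io"))

# With no io blocked on the reshape, mddev_suspend() has nothing to wait for
# and the cycle disappears.
no_io = {"reshape": "mddev_suspend"}
print(find_cycle(no_io, "reshape"))
```

The second case matches what Jove saw: once nothing issues io to the array
during assembly, the reshape is free to continue.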
Jove, as I understand this, if mdadm makes progress without a blocked
io, and the reshape continues, it seems you can use this array without
problem.
Thanks,
Kuai
Cheers,
Wol