Re: [PATCH] mdadm: stop using 'idle' for sysfs api "sync_action" to wake up sync_thread


 



Hi,

On 2024/01/12 19:44, Mariusz Tkaczyk wrote:
On Thu, 11 Jan 2024 20:05:05 +0800
Yu Kuai <yukuai1@xxxxxxxxxxxxxxx> wrote:

From: Yu Kuai <yukuai3@xxxxxxxxxx>

Echoing 'idle' to "sync_action" is supposed to stop sync_thread while
still allowing a new sync_thread to start. However, the current behaviour
is not correct: echo 'idle' actually tries to stop sync_thread and then
starts a new sync_thread. mdadm relies on this wrong behaviour in some
places.

In the kernel, if resync is not done yet, then recovery/reshape/check/repair
can't start in the first place, and if resync is done, echo 'resync'
behaves the same as echo 'idle' for now.

Hi Kuai,
From the last part I understand that in the case of resync/reshape a frozen
thread is unblocked, not restarted.

I am missing some explanation of that here. What I understand so far is:

"Setting "resync" or "reshape" allows a frozen sync_thread to continue instead
of restarting it. Setting "resync" when resync is done has the same effect as
"idle", so it is safe."

Please describe setting "reshape". I can see that you use it in one place. I
think that with reshape we need to be more careful, but you are the expert
here; maybe it is the same as "resync"?

The only place "reshape" is used is where reshape was frozen before, and
echo "reshape" can continue the interrupted reshape. Of course, echo
"idle/resync" can also continue the interrupted reshape.


Hence replace echo 'idle' with echo 'resync/reshape' when trying to
continue a frozen sync_thread. There should be no functional changes, and
this prevents regressions once the kernel is fixed so that echo 'idle' no
longer starts a new sync_thread.
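
For readers following along, the "echo" in the text above is just a write of
a short string to the array's "sync_action" sysfs file. A minimal sketch of
such a write is below. The function name write_sync_action() is hypothetical
and is not the helper mdadm actually uses; the path in the usage note is only
an example.

```c
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: write an action string such as "resync" or
 * "reshape" to a sync_action file. Returns 0 on success, -1 on failure.
 * The file is opened write-only and is never created, since a sysfs
 * attribute either exists already or is not available at all.
 */
int write_sync_action(const char *path, const char *action)
{
	size_t len = strlen(action);
	ssize_t n;
	int fd;

	fd = open(path, O_WRONLY);
	if (fd < 0)
		return -1;

	n = write(fd, action, len);
	close(fd);

	/* Treat a short write as a failure. */
	return (n == (ssize_t)len) ? 0 : -1;
}
```

Assuming an array named md0, continuing a frozen resync would then be
write_sync_action("/sys/block/md0/md/sync_action", "resync") where the
current code writes "idle".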

Ok, so this is kind of preparing for kernel fix. Got it.

Signed-off-by: Yu Kuai <yukuai3@xxxxxxxxxx>
---

I think that I understand the purpose of the change. You are trying to avoid
restarting the thread when it is not needed and to remove mdadm's reliance on
the incorrect "idle" behaviour.
Unfortunately, the changes you need to make are strongly tied to the kernel
implementation. They need to be described well because blame is volatile.

I would like to propose a separate enum so that we do not rely on kernel state
naming. Some proposals:

/* As far as I understand, write "resync" in both cases */
SYNC_ACTION_RESYNC_START
SYNC_ACTION_RESYNC_CONTINUE

/* As far as I understand, write "reshape" in both cases */
SYNC_ACTION_RESHAPE_START
SYNC_ACTION_RESHAPE_CONTINUE

/* Highlight known bug in comment and use "resync"? */
SYNC_ACTION_IDLE
/* If needed? */
SYNC_ACTION_ABORT

The enum sounds good, and I thought about an enum in the kernel as well.
Currently the resync/recovery/reshape/... status is determined by a
combination of flags, which is hard to follow.

It needs to be handled by a proper function with comments describing what is
written to the kernel and why. In userspace, I need more user/reader friendly
code.
I want to know exactly what we requested from the kernel. In some cases we
would expect to restart the thread, in other cases just to continue a frozen
one. I would like to know the purpose of the request in each particular case,
even if the same action is used behind it for now.
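
To make the proposal concrete, the enum and the function that translates it
could look roughly like the sketch below. The enumerator names are the ones
proposed above; the enum tag sync_action_request and the function name
sync_action_to_sysfs() are hypothetical. The strings for the resync and
reshape cases follow the comments in the proposal. What SYNC_ACTION_IDLE and
SYNC_ACTION_ABORT should write is still an open question here, so the choices
made for them are assumptions and are marked as such.

```c
#include <stddef.h>

/* What the caller wants, independent of kernel state naming. */
enum sync_action_request {
	SYNC_ACTION_RESYNC_START,
	SYNC_ACTION_RESYNC_CONTINUE,
	SYNC_ACTION_RESHAPE_START,
	SYNC_ACTION_RESHAPE_CONTINUE,
	SYNC_ACTION_IDLE,
	SYNC_ACTION_ABORT,
};

/*
 * Hypothetical helper: return the string to write to "sync_action" for
 * a given request, or NULL if no mapping has been decided yet.
 */
const char *sync_action_to_sysfs(enum sync_action_request req)
{
	switch (req) {
	case SYNC_ACTION_RESYNC_START:
	case SYNC_ACTION_RESYNC_CONTINUE:
		/* Same string today; the enum records the intent. */
		return "resync";
	case SYNC_ACTION_RESHAPE_START:
	case SYNC_ACTION_RESHAPE_CONTINUE:
		return "reshape";
	case SYNC_ACTION_IDLE:
		/* Assumption: plain "idle"; "resync" was also suggested. */
		return "idle";
	case SYNC_ACTION_ABORT:
		/* Assumption: undecided, so nothing is mapped. */
		return NULL;
	}
	return NULL;
}
```

The START and CONTINUE variants deliberately map to the same string, so each
call site documents why it writes to the kernel even while the kernel treats
both requests the same way.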

Sounds like a good plan. It looks like the first thing to do is to sort
out all the places in mdadm that use the sysfs api 'sync_action'; then we'll
know how to define the new helper (I only grepped for the 'idle' case for now).

Thanks,
Kuai


Let me know what you think.

Thanks,
Mariusz






