On Thu, Nov 19, 2020 at 12:45:53AM +0800, Zhao Heming wrote:
> A reshape request should be blocked while a resync job is ongoing. In a
> cluster environment, a node can start a resync job even if the resync
> command was not executed on it; e.g., when the user runs "mdadm --grow"
> on node A, node B will sometimes start the resync job. However, the
> current update_raid_disks() only checks the local recovery status, which
> is incomplete. As a result, "mdadm --grow" succeeds on the local node,
> while the remote node refuses to do the reshape job because it is busy
> with its resync job. This inconsistent handling leaves the array in an
> unexpected state. If the user does not notice the issue and keeps
> issuing mdadm commands, the array eventually stops working.
>
> Fix this issue by blocking the reshape request: when a node executes
> "--grow" and detects an ongoing resync, it should stop and report an
> error to the user.
>
> The following script reproduces the issue with ~100% probability.
> (Two nodes share 3 iSCSI LUNs: sdg/sdh/sdi. Each LUN is 1 GB.)
> ```
> # on node1; node2 is the remote node.
> ssh root@node2 "mdadm -S --scan"
> mdadm -S --scan
> for i in {g,h,i}; do dd if=/dev/zero of=/dev/sd$i oflag=direct bs=1M \
>   count=20; done
>
> mdadm -C /dev/md0 -b clustered -e 1.2 -n 2 -l mirror /dev/sdg /dev/sdh
> ssh root@node2 "mdadm -A /dev/md0 /dev/sdg /dev/sdh"
>
> sleep 5
>
> mdadm --manage --add /dev/md0 /dev/sdi
> mdadm --wait /dev/md0
> mdadm --grow --raid-devices=3 /dev/md0
>
> mdadm /dev/md0 --fail /dev/sdg
> mdadm /dev/md0 --remove /dev/sdg
> mdadm --grow --raid-devices=2 /dev/md0
> ```
>
> Signed-off-by: Zhao Heming <heming.zhao@xxxxxxxx>
> ---
>  drivers/md/md.c | 8 ++++++--
>  1 file changed, 6 insertions(+), 2 deletions(-)

<formletter>

This is not the correct way to submit patches for inclusion in the
stable kernel tree.  Please read:
    https://www.kernel.org/doc/html/latest/process/stable-kernel-rules.html
for how to do this properly.

</formletter>