Dear Heming,
On 08.04.21 at 07:52, heming.zhao@xxxxxxxx wrote:
On 4/8/21 1:09 PM, Paul Menzel wrote:
On 08.04.21 at 05:01, Heming Zhao wrote:
md_kick_rdev_from_array() removes rdev from the list, so we should
use rdev_for_each_safe() to iterate over the list.
How to trigger:
```
for i in {1..20}; do
echo ==== $i `date` ====;
mdadm -Ss && ssh ${node2} "mdadm -Ss"
wipefs -a /dev/sda /dev/sdb
mdadm -CR /dev/md0 -b clustered -e 1.2 -n 2 -l 1 /dev/sda \
/dev/sdb --assume-clean
ssh ${node2} "mdadm -A /dev/md0 /dev/sda /dev/sdb"
mdadm --wait /dev/md0
ssh ${node2} "mdadm --wait /dev/md0"
mdadm --manage /dev/md0 --fail /dev/sda --remove /dev/sda
sleep 1
done
```
In the test script, I do not understand what node2 is used for, the
host you log in to over SSH.
The bug can only be triggered in a cluster environment. There are two
nodes in the cluster; the script runs on node1 and needs SSH access to
node2 to execute some commands.
${node2} stands for node2's IP address, e.g.: ssh 192.168.0.3 "mdadm
--wait ..."
Please excuse my ignorance. I guess some other component is needed to
connect the two RAID devices on each node? At least you never tell mdadm
directly to use *node2*. Reading *Cluster Multi-device (Cluster MD)* [1],
it seems a resource agent is needed.
... ...
Signed-off-by: Heming Zhao <heming.zhao@xxxxxxxx>
Reviewed-by: Gang He <ghe@xxxxxxxx>
If there is a commit, your patch is fixing, please add a Fixes: tag.
OK, I forgot it, will send v2 patch later.
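For reference, the kernel convention for a Fixes: tag is the first 12 characters of the offending commit's SHA plus its subject line; the hash and subject below are placeholders, not the actual commit being fixed:

```
Fixes: 123456789abc ("md-cluster: subject of the commit being fixed")
```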
Awesome.
Kind regards,
Paul
[1]:
https://documentation.suse.com/sle-ha/12-SP4/html/SLE-HA-all/cha-ha-cluster-md.html