Hello List,

Here are two patches that fix md-cluster bugs.

The two bugs are different, but both can be triggered with the same test script:

```
ssh root@node2 "mdadm -S --scan"
mdadm -S --scan
for i in {g,h,i};do dd if=/dev/zero of=/dev/sd$i oflag=direct bs=1M \
count=20; done

echo "mdadm create array"
mdadm -C /dev/md0 -b clustered -e 1.2 -n 2 -l mirror /dev/sdg /dev/sdh \
 --bitmap-chunk=1M
echo "set up array on node2"
ssh root@node2 "mdadm -A /dev/md0 /dev/sdg /dev/sdh"

sleep 5

mkfs.xfs /dev/md0
mdadm --manage --add /dev/md0 /dev/sdi
mdadm --wait /dev/md0
mdadm --grow --raid-devices=3 /dev/md0

mdadm /dev/md0 --fail /dev/sdg
mdadm /dev/md0 --remove /dev/sdg
mdadm --grow --raid-devices=2 /dev/md0
```

For details, please see the commit log of each patch.

-------
v4:
- revise subject & commit log on both patches
- no code changes
v3:
- patch 1/2
  - no change
- patch 2/2
  - use Xiao's solution for the fix
  - revise commit log for the "How to fix" part
v2:
- patch 1/2
  - change patch subject
  - add test result in commit log
  - no code changes
- patch 2/2
  - add test result in commit log
  - add error handling of remove_disk in hot_remove_disk
  - add error handling of lock_comm in all callers
  - remove 5s timeout fix on the receive side (for process_metadata_update)
v1:
- create patch
-------
Zhao Heming (2):
  md/cluster: block reshape with remote resync job
  md/cluster: fix deadlock when node is doing resync job

 drivers/md/md-cluster.c | 69 +++++++++++++++++++++++------------------
 drivers/md/md.c         | 14 ++++++---
 2 files changed, 49 insertions(+), 34 deletions(-)

--
2.27.0