I am resending the v4 patch set with the correct Cc tag.

On 11/19/20 7:41 PM, Zhao Heming wrote:
> Hello List,
>
> There are two patches to fix md-cluster bugs.
>
> The two different bugs can be triggered by the same test script:
>
> ```
> ssh root@node2 "mdadm -S --scan"
> mdadm -S --scan
> for i in {g,h,i};do dd if=/dev/zero of=/dev/sd$i oflag=direct bs=1M \
> count=20; done
>
> echo "mdadm create array"
> mdadm -C /dev/md0 -b clustered -e 1.2 -n 2 -l mirror /dev/sdg /dev/sdh \
> --bitmap-chunk=1M
> echo "set up array on node2"
> ssh root@node2 "mdadm -A /dev/md0 /dev/sdg /dev/sdh"
>
> sleep 5
>
> mkfs.xfs /dev/md0
> mdadm --manage --add /dev/md0 /dev/sdi
> mdadm --wait /dev/md0
> mdadm --grow --raid-devices=3 /dev/md0
>
> mdadm /dev/md0 --fail /dev/sdg
> mdadm /dev/md0 --remove /dev/sdg
> mdadm --grow --raid-devices=2 /dev/md0
> ```
>
> For details, please check the commit log of each patch.
>
> -------
> v4:
> - revise subject & commit log on both patches
> - no change for code
> v3:
> - patch 1/2
>   - no change
> - patch 2/2
>   - use Xiao's solution to fix
>   - revise commit log for the "How to fix" part
> v2:
> - patch 1/2
>   - change patch subject
>   - add test result in commit log
>   - no change for code
> - patch 2/2
>   - add test result in commit log
>   - add error handling of remove_disk in hot_remove_disk
>   - add error handling of lock_comm in all callers
>   - remove 5s timeout fix in receive side (for process_metadata_update)
> v1:
> - create patch
> -------
> Zhao Heming (2):
>   md/cluster: block reshape with remote resync job
>   md/cluster: fix deadlock when node is doing resync job
>
>  drivers/md/md-cluster.c | 69 +++++++++++++++++++++++------------------
>  drivers/md/md.c         | 14 ++++++---
>  2 files changed, 49 insertions(+), 34 deletions(-)
>