Chris Dunlop <chris@xxxxxxxxxxxx> wrote on Thu, 25 Oct 2018 at 2:52 PM:
>
> Hi,
>
> kernel: v4.18.16
> mdadm: current HEAD, 5d518de

Hi,

This is possibly related to
https://git.kernel.org/pub/scm/linux/kernel/git/shli/md.git/commit/?h=for-next&id=059421e041eb461fb2b3e81c9adaec18ef03ca3c

Can you try applying the patch to your kernel to see if it fixes your problem?

Regards,
Jack

> It looks like there's some issue between scrubbing and writing to
> suspend_lo and suspend_hi, e.g. an inverted lock or missed wakeup etc.
>
> I had an md lockup on a production machine when running a scrub and
> raid6check on the same md device at the same time. I eventually had to
> reset the box to recover.
>
> At the point of the hang, md1_raid6 was chewing up a lot of CPU but not
> making any progress (per /sys/block/md1/md/sync_completed), and the
> raid6check process was unkillable (kill -9 didn't work); lsof showed it had
> /sys/devices/virtual/block/md1/md/suspend_lo open for write.
>
> The raid6check code writes to suspend_lo and suspend_hi in its
> lock_stripe() routine, to lock each stripe in turn as it works its way
> through the md device.
>
> I'm able to reproduce the lockup on a debian9 single-cpu kvm virtual
> machine in 2-10 rounds of the reproducer below.
>
> The reproducer prints dots at intervals on the order of a few seconds. If
> the problem is hit, the dots stop coming. At that point the shell should
> have suspend_lo or suspend_hi open for write, and will be unkillable.
>
> Cheers,
>
> Chris
>
> ----------------------------------------------------------------------
> #
> # Setup
> #
> # Create 6 x 11-dev raid6, wait for sync to finish
> #
> function test_setup
> {
>     for md in md{1..6}; do
>         echo "creating ${md}"
>         for i in {1..11}; do
>             f=/var/tmp/${md}-vdev${i}
>             truncate -s 2G "${f}"
>             loop[$i]=$(losetup -f)
>             losetup "${loop[$i]}" "${f}"
>         done
>         mdadm --create "/dev/${md}" --level=6 --raid-disks=11 "${loop[@]}"
>     done
>     while grep resync /proc/mdstat; do sleep 2; done
>     cat /proc/mdstat
> }
>
> #
> # Reproducer
> #
> # Continuous scrub of all mds, and lock successive stripes of md1 per
> # raid6check:lock_stripe()
> #
> function test_run
> {
>     declare -i component_size=$(($(</sys/block/md1/md/component_size) * 1024))  # KB to bytes
>     declare -i chunk_size=$(</sys/block/md1/md/chunk_size)
>     declare -i stripes=$((component_size / chunk_size))
>     declare -i data_disks=$(($(</sys/block/md1/md/raid_disks) - 2))
>     declare -i i=0 j stripe
>
>     while : ; do
>         i=$((i + 1))
>         date +"%F-%T Round $i"
>
>         #
>         # Start scrub on all mds
>         #
>         for md in md{1..6}; do
>             echo check > "/sys/block/${md}/md/sync_action"
>         done
>         sleep 2
>
>         #
>         # keep writing to md1 suspend_{lo,hi} as raid6check does
>         #
>         j=0
>         while grep -q check /proc/mdstat; do
>             j=$((j + 1))
>             echo -e " $j \c"
>             stripe=0
>             while [[ stripe -le stripes ]]; do
>                 [[ $((stripe % 10)) -eq 0 ]] && echo -e '.\c'
>                 echo $((stripe * chunk_size * data_disks)) > /sys/devices/virtual/block/md1/md/suspend_lo
>                 echo $(((stripe + 1) * chunk_size * data_disks)) > /sys/devices/virtual/block/md1/md/suspend_hi
>                 sleep 0.2
>                 stripe+=1
>             done
>             echo
>         done
>     done
> }
> ----------------------------------------------------------------------
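For anyone eyeballing the suspend_{lo,hi} values the reproducer writes, the per-stripe byte range can be computed standalone. The numbers below are assumed example values only (512 KiB chunks, which is mdadm's default, and 9 data disks, i.e. 11 raid-disks minus 2 parity for RAID6), not taken from the report:

```shell
# Byte range written to suspend_lo/suspend_hi for one stripe, as in the
# reproducer's inner loop. chunk_size and data_disks are assumed
# example values.
chunk_size=524288   # 512 KiB chunk (assumed; read from sysfs in the real script)
data_disks=9        # 11 raid-disks minus 2 RAID6 parity disks
stripe=3

lo=$((stripe * chunk_size * data_disks))
hi=$(((stripe + 1) * chunk_size * data_disks))
echo "suspend_lo=$lo suspend_hi=$hi"   # suspend_lo=14155776 suspend_hi=18874368
```

So each iteration suspends exactly one full data stripe (chunk_size * data_disks bytes) before moving to the next.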
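Not part of the original report, but for completeness: a sketch of tearing the test setup down between runs (device names and file paths assumed to match test_setup above; needs root):

```shell
# Teardown sketch for the setup created by test_setup above.
# Assumes the md names and /var/tmp backing-file paths from that script.
function test_teardown
{
    for md in md{1..6}; do
        mdadm --stop "/dev/${md}"
    done
    # Detach the loop devices backing each test file, then remove the file.
    for f in /var/tmp/md{1..6}-vdev{1..11}; do
        losetup -j "${f}" | cut -d: -f1 | xargs -r losetup -d
        rm -f "${f}"
    done
}
```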