On Sun, 26 Apr 2015 20:56:58 -0500 David Wahler <dwahler@xxxxxxxxx> wrote:

> [oops, forgot to cc the list]
>
> On Sun, Apr 26, 2015 at 8:20 PM, NeilBrown <neilb@xxxxxxx> wrote:
> >> And the output of mdadm --detail/-E:
> >> https://gist.github.com/anonymous/0b090668b56ef54bb2f0
> >
> > What is wrong with simply including this directly in the email???
>
> My bad; I wasn't sure whether it was appropriate to paste such a long
> dump inline.
>
> > Anyway:
> >
> >    Bad Block Log : 512 entries available at offset 72 sectors - bad blocks present.
> >
> > that is the only thing that looks at all interesting. Particularly the last
> > 3 words.
> > What does
> >    mdadm --examine-badblocks /dev/sd[cde]1
> > show?
>
> root@ceres:~# mdadm --examine-badblocks /dev/sd[cde]1
> Bad-blocks on /dev/sdc1:
>           3699640928 for 32 sectors
> Bad-blocks on /dev/sdd1:
>           3699640928 for 32 sectors
> Bad-blocks on /dev/sde1:
>           3699640928 for 32 sectors
>
> Hmm, that seems kind of odd to me. For what it's worth, all four
> drives passed a SMART self-test, and "dd > /dev/null" completed
> without errors on all of them. I just read about the "badblocks" tool
> and I'm running it now.

The array is reshaping a RAID6 from 4->5 devices, so that is 2 data disks to
3 data disks.

   Reshape pos'n : 5548735488 (5291.69 GiB 5681.91 GB)

so it is about 5.3TB through the array, so it has read about 2.6TB from the
devices and written about 1.7TB to the devices.

3699640928 sectors is about 1.8TB. That seems a little too close to be a
coincidence.

Maybe when reshape writes to somewhere that is a bad-block, it gets confused.

On the other hand, when it copies from the 1.8TB address to the 1.2TB address
it should have recorded that it was bad-blocks that were being copied. It
doesn't seem like it did.

I'll have to look at the code and try to figure out what is happening.

I don't think there is anything useful you can do in the meantime...

NeilBrown
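For anyone wanting to check the arithmetic above, here is a rough sketch in Python. It assumes the "Reshape pos'n" value is in KiB (which matches the GiB/GB figures mdadm prints next to it), that bad-block entries count 512-byte sectors, and it ignores any data offset on the member devices. The variable and function names are made up for illustration; none of this comes from mdadm or the kernel.

```python
# Sketch only: the units are assumptions (reshape position in KiB,
# bad-block start in 512-byte sectors); data offset is ignored.
SECTOR_BYTES = 512
KIB = 1024
MIB = 1024 ** 2
TIB = 1024 ** 4

reshape_pos_kib = 5548735488          # "Reshape pos'n" from mdadm
bad_block_start_sectors = 3699640928  # from --examine-badblocks
old_data_disks = 2                    # RAID6 on 4 devices
new_data_disks = 3                    # RAID6 on 5 devices


def per_device_bytes(array_kib, data_disks):
    """Bytes touched on each member for a given array-level position."""
    return array_kib * KIB // data_disks


# How far into each device the reshape has read (old layout)
# and written (new layout).
read_bytes = per_device_bytes(reshape_pos_kib, old_data_disks)
written_bytes = per_device_bytes(reshape_pos_kib, new_data_disks)

# Where the recorded bad range starts on each device.
bad_block_bytes = bad_block_start_sectors * SECTOR_BYTES

# Distance between the write position and the bad range.
gap_mib = (bad_block_bytes - written_bytes) / MIB

print(f"read per device:    {read_bytes / TIB:.2f} TiB")
print(f"written per device: {written_bytes / TIB:.2f} TiB")
print(f"bad block start:    {bad_block_bytes / TIB:.2f} TiB")
print(f"bad block is {gap_mib:.1f} MiB past the write position")
```

Under those assumptions the bad range starts only a couple of hundred MiB beyond the per-device write position, which is the closeness referred to above.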