On Sun, Oct 30, 2016 at 10:33:37AM +0100, Andreas Klauer wrote:
> On Sat, Oct 29, 2016 at 07:16:14PM -0700, Marc MERLIN wrote:
> > Can someone tell me how this is possible?
> > More generally, is it possible for the kernel to return an md error
> > and then not log any underlying hardware error on the drives the md
> > was being read from?
>
> Is there something in mdadm --examine(-badblocks) /dev/sd*?

Well, well, I learned something new today. First I had to upgrade my mdadm
tools to get that option, and sure enough:

myth:~# mdadm --examine-badblocks /dev/sd[defgh]1
Bad-blocks on /dev/sdd1:
            14408704 for 352 sectors
            14409568 for 160 sectors
           132523032 for 512 sectors
           372496968 for 440 sectors
Bad-blocks list is empty in /dev/sde1
Bad-blocks on /dev/sdf1:
            14408704 for 352 sectors
            14409568 for 160 sectors
           132523032 for 512 sectors
           372496968 for 440 sectors
Bad-blocks list is empty in /dev/sdg1
Bad-blocks list is empty in /dev/sdh1

So thank you for pointing me in the right direction.

I think they are due to the fact that it's an external disk array on a port
multiplier where sometimes I get bus errors that aren't actually on the
disks.

Questions:
1) shouldn't my array have been invalidated if I have bad blocks on 2 drives
in the same place, or is the only possible way for this to happen that it did
get invalidated and I somehow force rebuilt the array to bring it back up
and I don't remember doing so?
(mmmh, but even so, rebuilding the spare should have cleared the bad blocks
on at least one drive, no?)

2) I'm currently running this, which I believe is the way to recover:
myth:~# echo 'check' > /sys/block/md5/md/sync_action
but I'm not too hopeful on how that's going to work out if I have 2 drives
with supposed bad blocks at the same offsets.

Is there another way to just clear the bad block list on both drives if I've
already verified that those blocks are not bad and that they were due to
some I/O errors that came from a bad cable connection?
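
For what it's worth, the "same place on 2 drives" claim can be checked
mechanically instead of by eye. Below is a rough Python sketch that parses the
--examine-badblocks output pasted above and lists the entries recorded on more
than one member. The names parse_badblocks, shared_ranges and SAMPLE are made
up for illustration, the parser assumes exactly the output layout shown in
this mail, and it only matches identical (start, length) entries rather than
partial overlaps.

```python
import re
from collections import defaultdict

# Output copied from the mail above; indentation is not significant here
# because every line is stripped before matching.
SAMPLE = """\
Bad-blocks on /dev/sdd1:
            14408704 for 352 sectors
            14409568 for 160 sectors
           132523032 for 512 sectors
           372496968 for 440 sectors
Bad-blocks list is empty in /dev/sde1
Bad-blocks on /dev/sdf1:
            14408704 for 352 sectors
            14409568 for 160 sectors
           132523032 for 512 sectors
           372496968 for 440 sectors
Bad-blocks list is empty in /dev/sdg1
Bad-blocks list is empty in /dev/sdh1
"""


def parse_badblocks(text):
    """Return {device: [(start_sector, length), ...]} from the pasted output.

    Hypothetical helper: it assumes the three line shapes seen in SAMPLE.
    """
    lists = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        m = re.match(r"Bad-blocks on (\S+):$", line)
        if m:
            current = m.group(1)
            lists[current] = []
            continue
        m = re.match(r"Bad-blocks list is empty in (\S+)$", line)
        if m:
            lists[m.group(1)] = []
            current = None
            continue
        m = re.match(r"(\d+) for (\d+) sectors$", line)
        if m and current is not None:
            lists[current].append((int(m.group(1)), int(m.group(2))))
    return lists


def shared_ranges(lists):
    """Return {(start, length): [devices]} for entries seen on 2+ devices.

    Only exact duplicates are reported; partially overlapping ranges are
    not detected by this sketch.
    """
    seen = defaultdict(list)
    for dev, ranges in lists.items():
        for entry in ranges:
            seen[entry].append(dev)
    return {entry: devs for entry, devs in seen.items() if len(devs) > 1}


if __name__ == "__main__":
    for (start, length), devs in sorted(shared_ranges(parse_badblocks(SAMPLE)).items()):
        print(f"{start} for {length} sectors: {', '.join(devs)}")
```

Run against the paste above, this should report all four entries as present on
both /dev/sdd1 and /dev/sdf1, which is what makes the lists look like the
result of a shared bus problem rather than two independent media failures.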

Thanks,
Marc
-- 
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems ....
                                      .... what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/