Original thread on btrfs list (the OP's links here didn't work for me):
http://www.spinics.net/lists/linux-btrfs/msg53143.html

On Mon, Mar 21, 2016 at 6:42 AM, Phil Turmel <philip@xxxxxxxxxx> wrote:
> Hi Patrick,
>
> On 03/20/2016 06:37 PM, Andreas Klauer wrote:
>> On Sun, Mar 20, 2016 at 10:44:57PM +0100, Patrick Tschackert wrote:
>>> After rebooting the system, one of the hard disks was missing from my md raid 6 (the drive was /dev/sdf), so I rebuilt it with a hot spare that was already present in the system.
>>> I physically removed the "missing" /dev/sdf drive after the restore and replaced it with a new drive.
>
> Your smartctl output shows pending sector problems with sdf, sdh, and
> sdj. The latter are WD Reds that won't keep those problems through a
> scrub, so I guess the smartctl report was from before that?

From what I understand, no, the smartctl output is from after the scrub check. The dmesg shows read errors but no md attempt to fix up those errors, which I thought was strange but might be a good thing if the raid is not assembled correctly.

>> Your best bet is that the data is valid on n-2 disks.
>>
>> Use overlay https://raid.wiki.kernel.org/index.php/Recovering_a_failed_software_RAID#Making_the_harddisks_read-only_using_an_overlay_file
>>
>> Assemble the overlay RAID with any 2 disks missing (try all combinations) and see if you get valid data.
>
> No. Something else is wrong, quite possibly hardware. You don't get a
> mismatch count like that without it showing up in smartctl too, unless
> corrupt data was being written to one or more disks for a long time.
>
> It's unclear from your dmesg what might have happened. Probably bad
> stuff going back years.

That seems unlikely, because this was a functioning raid6 with Btrfs on top, so there would have been a ton of Btrfs complaints. I think something went wrong with the device replace procedure; I just can't tell what, because all the devices are present and working according to the -D output.

In that first message on the btrfs list you can see in more detail what works and what doesn't. The summary is: all three Btrfs super blocks are found. That wouldn't be possible if the array weren't at least partially correct and the LUKS volume weren't being unlocked correctly, unless there's something very nuanced and detailed we're not understanding yet. But as soon as commands are used to look for other things, there are immediate failures: lots of metadata checksum errors, and an inability to read the chunk and root trees. So it's as if there's a hole in the file system; I just can't tell whether it's a small one, like the size of a drive, or a big one.

> Otherwise you are at the mercy of fsck to try to fix your volume. I
> would use an overlay for that.

At this point I'm skeptical this will work. Also, I'm not familiar with this overlay technique. I did look at the URL provided by Andreas; my concern is whether it's possible for the volume UUID to appear more than once to the kernel. There are some very tricky things about Btrfs's dependency on the volume UUID that can make it get confused about where it should be writing when it sees more than one device with the same UUID. This is a problem with, for example, Btrfs on LVM: taking a snapshot of an LV while both LVs are active means, in effect, two Btrfs instances with the same UUID, and Btrfs can clobber them both in a bad way.
https://btrfs.wiki.kernel.org/index.php/Gotchas
(What that wiki recipe amounts to is sketched below.)

I really think the Btrfs file system, based on the OP's description on the Btrfs list, is probably OK.
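For reference, if the overlay route does get tried, the recipe on that wiki page boils down to roughly the following. This is untested here, and everything in it is a placeholder to adjust for the real system: the member list, the /overlay working directory, the 4G sparse-file size, and the overlay-* names.

# All device names and sizes below are placeholders; substitute the real members.
DEVICES="/dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdg /dev/sdh /dev/sdi /dev/sdj"
mkdir -p /overlay && cd /overlay

for d in $DEVICES; do
    b=$(basename "$d")
    size=$(blockdev --getsz "$d")      # member size in 512-byte sectors
    truncate -s 4G "overlay-$b"        # sparse file that absorbs all writes
    loop=$(losetup -f --show "overlay-$b")
    # dm snapshot target: reads come from the real disk, writes go to the
    # sparse file, so the member drives themselves are never written to.
    dmsetup create "overlay-$b" --table "0 $size snapshot $d $loop P 8"
done

# The array then gets assembled from /dev/mapper/overlay-* instead of the
# raw members. Given the UUID concern above, the real md array should stay
# stopped and the real LUKS volume stay locked while the overlay copy is
# assembled and unlocked, so the kernel only ever sees one copy of the UUID.

Anything a subsequent fsck or 'btrfs check --repair' writes ends up in those sparse files rather than on the drives, which is the whole point of the exercise.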
The issue is raid6 assembly somehow being wonky. Even if it were double degraded by pulling any two suspect drives, I'd expect things to immediately get better, and a no-modify 'btrfs check' would then come up clean (roughly the try-all-pairs loop sketched at the end of this message). The OP had a clean shutdown. But it's an open question how long after the device failure he actually noticed it before doing the rebuild, how he did that rebuild, and whether critical data is missing on any of the other bad sectors on the three remaining drives. Chances are, those sectors don't overlap, though. But at this point we need to hear back from Patrick.

--
Chris Murphy
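P.S. For the "try all combinations" part, a rough sketch of the sort of loop I mean, run against the overlay devices from the earlier sketch rather than the real disks. Again untested; /dev/md99 and the testcrypt mapping name are made-up examples, cryptsetup will prompt for the LUKS passphrase on every pass, and 'btrfs check' without --repair doesn't write anything.

OVERLAYS=(/dev/mapper/overlay-*)
n=${#OVERLAYS[@]}

for ((i = 0; i < n - 1; i++)); do
  for ((j = i + 1; j < n; j++)); do
    # Build the member list with overlays i and j left out.
    members=()
    for ((k = 0; k < n; k++)); do
      (( k == i || k == j )) && continue
      members+=("${OVERLAYS[k]}")
    done
    echo "=== leaving out ${OVERLAYS[i]} and ${OVERLAYS[j]} ==="
    # --run starts the array even though it's degraded; --force may be
    # needed if the event counts disagree.
    mdadm --assemble --run --force /dev/md99 "${members[@]}" || continue
    cryptsetup luksOpen /dev/md99 testcrypt &&
        btrfs check /dev/mapper/testcrypt    # read-only, no --repair
    cryptsetup luksClose testcrypt
    mdadm --stop /dev/md99
    # Any writes from a bad assembly land in the overlay files, so tear
    # them down and recreate them before trying the next combination.
  done
done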