On 19/01/19 13:30, Carsten Aulbert wrote:
> Hi
>
> On 1/19/19 2:21 PM, Basil Mohamed Gohar wrote:
>> I have two drives of the 4-drive RAID6 array visible, but no files
>> are accessible because it's a RAID6; I need at least 3 of the 4
>> drives working, and my problem is two are experiencing this problem.
>
> Hmm, that would be surprising, as RAID6 should offer two-disk
> redundancy, i.e. any two disks may fail and you should still be able
> to access your data - albeit without any extra safety net.

That was my reaction - raid6 should survive two drive failures.
Although I think *any* drive failure will result in the array failing
to start until you force it - if it's been running degraded it will
restart in the same configuration, but if it degrades further it won't
restart without a force. Check that out.

>> This is challenging because it is in a tower array and all the
>> drives connect straight to a motherboard-like backplane. I took one
>> out and was working with it directly via a USB SATA adapter, but
>> that did not change the errors I was seeing.
>
> OK, I just wanted to make sure that the error "stayed" with the
> drives.
>
>> Yes, they are. SMART reports no fatal errors on the drives in
>> question!
>
> OK, at least that.
>
>> What may help me is if there are any tools for md devices that let
>> me peek into the on-disk structure. Since the ext4 file system is
>> spread across the 3 data drives in the array, I cannot use, for
>> example, e2fsck on just one of them, and since I cannot properly
>> assemble the array, I am somewhat stuck. Are there any tools for
>> examining an array of drives even if it is not recognized as such?
>> I don't know, for example, if some sectors went bad, how to tell
>> mdadm to look in alternate locations (i.e., akin to ext4's
>> alternative superblock locations).
>
> As indicated above, with RAID6 you "only" have two data disks in a
> four-disk RAID6; as RAID6 does not write data copies but "generated"
> parity stripes to the two extra disks, it can compute back what
> should have been on the data stripes of the failed disks. But reverse
> engineering this is probably not really easy to perform "manually".
>
> Thus, at first, we should really establish what the underlying layout
> was, i.e. can you send us the output of /proc/mdstat?

Might be too late for that. Two tools that are probably useful are
Phil Turmel's lsdrv, and wipefs, which I saw mentioned here a few days
ago - it has an option to do nothing that just gives you info.

https://raid.wiki.kernel.org/index.php/Linux_Raid#When_Things_Go_Wrogn

Cheers,
Wol
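
P.S. If it helps, this is roughly what I'd run first, purely to gather
information without writing anything to the disks. The device names
(/dev/sd[a-d]1 etc.) are only placeholders - substitute whatever your
array members actually are:

  # what the kernel currently thinks of the array, if anything:
  cat /proc/mdstat

  # dump the md superblock on each member (array UUID, raid level,
  # device role, event count):
  mdadm --examine /dev/sd[a-d]1

  # wipefs with no options only lists signatures; nothing is erased
  # unless you also pass -a:
  wipefs /dev/sda1

  # full SMART report, not just the pass/fail verdict:
  smartctl -a /dev/sda

Save the --examine output from every member - the event counts and
device roles are what tell you which drives can still be assembled
together.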
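
P.P.S. If --examine shows the event counts on the members are all
close together, a forced assemble is usually the next step. Again, the
array and device names here are placeholders, and I'd strongly suggest
working on overlays or dd copies rather than the original drives (the
wiki page above explains how):

  # make sure nothing half-assembled is still holding the members:
  mdadm --stop /dev/md0

  # try to assemble from the named members, overriding the
  # event-count check:
  mdadm --assemble --force /dev/md0 /dev/sd[a-d]1

  # see whether it came up (degraded is fine for now):
  cat /proc/mdstat

Don't start any rebuild until you're happy the data is actually
readable (fsck -n, a read-only mount, etc.).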