On 4 October 2016 at 23:14, Slava Prisivko <vprisivko@gmail.com> wrote:
>> vgextend --restoremissing
>
> I didn't have to, because all the PVs are present:
>
> # pvs
>   PV         VG   Fmt  Attr PSize   PFree
>   /dev/sda2  vg   lvm2 a--    1.82t   1.10t
>   /dev/sdb2  vg   lvm2 a--    3.64t   1.42t
>   /dev/sdc2  vg   lvm2 a--  931.51g 195.18g

Double-check the metadata for MISSING. This is what I was hoping might be
in your /etc/lvm/backup file.

>> Actually, always run LVM commands with -v -t before really running them.
>
> Thanks! I had backed up the rmeta* and rimage*, so I didn't feel the need
> for using -t. Am I wrong?

Well, some nasty surprises may be avoidable (particularly if also using -f).

> Yes, I've noticed it. The problem was a faulty SATA cable (as I learned
> later), so when I switched the computer on for the first time, /dev/sda
> was missing (in the current device allocation). I switched off the
> computer, swapped the /dev/sda and /dev/sdb SATA cables (without thinking
> about the consequences) and switched it on. This time /dev/sdb was
> missing. I replaced the faulty cable with a new one and switched the
> machine back on. This time sda, sdb and sdc were all present, but the
> RAID went out of sync.

In swapping the cables you may have changed the sd{a,b,c} enumeration, but
this has no impact on the UUIDs that LVM uses to identify the PVs.

> I'm pretty sure there were very few (if any) write operations during the
> degraded operating mode, so I could recover by rebuilding the old mirror
> (sda) using the more recent ones (sdb and sdc).

Agreed, based on your check below.

> Thanks, I used your raid5_parity_check.cc utility with the default stripe
> size (64 * 1024), but it actually doesn't matter since you're just
> calculating the total xor and the stripe size acts as a buffer size for
> that.

[I was a little surprised to discover that RAID 6 works as a byte erasure
code.]

The stripe size and layout do matter once you want to adapt the code to
extract or repair the data.

> I get three unsynced stripes out of 512 (32 MiB / 64 KiB), but I would
> like to try to reconstruct test_rimage_1 using the other two. Just in
> case, here are the bad stripe numbers: 16, 48, 49.

I've updated the utility (this is for raid5 = raid5_ls). Warning: not
tested on out-of-sync data.

https://drive.google.com/open?id=0B8dHrWSoVcaDYXlUWXEtZEMwX0E

# Assume the first sub LV has the out-of-date data and dump the correct(ed) LV content.
./foo stripe $((64*1024)) repair 0 /dev/${lv}_rimage_* | cmp - /dev/${lv}

>> > The output of various commands is provided below.
>> >
>> > # lvs -a -o +devices
>> >
>> >   test            vg rwi---r--- 64.00m test_rimage_0(0),test_rimage_1(0),test_rimage_2(0)
>> >   [test_rimage_0] vg Iwi-a-r-r- 32.00m /dev/sdc2(1)
>> >   [test_rimage_1] vg Iwi-a-r-r- 32.00m /dev/sda2(238244)
>> >   [test_rimage_2] vg Iwi-a-r-r- 32.00m /dev/sdb2(148612)
>> >   [test_rmeta_0]  vg ewi-a-r-r-  4.00m /dev/sdc2(0)
>> >   [test_rmeta_1]  vg ewi-a-r-r-  4.00m /dev/sda2(238243)
>> >   [test_rmeta_2]  vg ewi-a-r-r-  4.00m /dev/sdb2(148611)

The extra r(efresh) attributes suggest trying a resync operation, which may
not be possible on an inactive LV. I missed that the RAID device is
actually in the list.
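
If the LV can be activated at some point, the resync itself is plain
lvchange. A rough, untested sketch (standard --refresh/--syncaction verbs;
the LV name is taken from your listing):

# drop the transient refresh flag and reload the kernel state
lvchange --refresh vg/test
# scrub: "check" only counts mismatches, "repair" makes parity consistent
# again (for raid5 it trusts the data blocks, so it will not restore the
# stale copy by itself)
lvchange --syncaction check vg/test
lvs -o +raid_sync_action,raid_mismatch_count vg/test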
> After cleaning the dmsetup table of test_* and trying to lvchange -ay I
> get practically the same:
>
> # lvchange -ay vg/test -v
[snip]
> device-mapper: reload ioctl on (253:87) failed: Invalid argument
> Removing vg-test (253:87)
>
> device-mapper: table: 253:87: raid: Cannot change device positions in
> RAID array
> device-mapper: ioctl: error adding target to table

This error occurs when the sub LV metadata says "I am device X in this
array" but dmsetup is being asked to put the sub LV at a different
position Y (alas, neither is logged). With lots of -v and -d flags you can
get lvchange to include the dm table entries in the diagnostics.

You can check the rmeta superblocks with
https://drive.google.com/open?id=0B8dHrWSoVcaDUk0wbHQzSEY3LTg

> Here is the relevant /etc/lvm/archive (archive is more recent than backup)

That looks sane, but you omitted the physical_volumes section, so there is
no way to cross-check UUIDs and devices or see if there are MISSING flags.

If you use https://drive.google.com/open?id=0B8dHrWSoVcaDQkU5NG1sLWc5cjg
directly, you can get the metadata that LVM is reading off the PVs and
double-check for discrepancies.
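
Back on the activation failure: a sketch of the sort of diagnostics I mean
(repeat -v/-d for more detail; the rmeta path assumes the sub LVs are
still set up under /dev/mapper, adjust to your system):

# include the generated dm table lines in the debug output
lvchange -ay vg/test -vvvv -dddd
# eyeball one rmeta superblock: the magic shows up as "DmRd" in the first
# bytes and, if I remember the layout right, the device's array position
# is one of the 32-bit words shortly after it
dd if=/dev/mapper/vg-test_rmeta_1 bs=4k count=1 2>/dev/null | hexdump -C | head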
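
For the cross-check itself, a by-hand alternative to the script (the
archive file name and the 1 MiB read are guesses, adjust to your layout):

# PV UUIDs as LVM currently sees them
pvs -o pv_name,pv_uuid,vg_name
# the text metadata LVM reads off one PV (the metadata area sits at the
# start of the PV by default)
dd if=/dev/sda2 bs=1M count=1 2>/dev/null | strings | less
# the archived copy: PV ids plus any MISSING flags
grep -nE 'id =|MISSING' /etc/lvm/archive/vg_*.vg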