Dear Linux RAID folks,

I hope I did not annoy you too much on #linux-raid. I am contacting this list to reach a broader audience for help and for archival purposes. My message to the dm-crypt list [1] was a little long and so is this one. I am sorry.

After growing `/dev/sda2` using `fdisk /dev/sda` (with mdadm not running), I forgot to grow the RAID1 before growing the LVM physical and logical volumes and the filesystems. I thereby probably overwrote the md metadata (0.90), or made it unavailable because it is no longer at the end of the partition.

# blkid
/dev/sda1: UUID="fb7f3dc5-d183-cab6-1212-31201a2207b9" TYPE="linux_raid_member"
/dev/sda2: UUID="cb6681b4-4dda-4548-8c59-3d9838de9c22" TYPE="crypto_LUKS"

In `fdisk` I had set the type to »Linux raid autodetect« (0xfd) though. I could not boot anymore because `/dev/md1` could not be assembled.

# mdadm --examine /dev/sda2
mdadm: No md superblock detected on /dev/sda2.

# mdadm --examine /dev/sda1
/dev/sda1:
          Magic : a92b4efc
        Version : 0.90.00
           UUID : fb7f3dc5:d183cab6:12123120:1a2207b9
  Creation Time : Wed Mar 26 11:49:57 2008
     Raid Level : raid1
  Used Dev Size : 497856 (486.27 MiB 509.80 MB)
     Array Size : 497856 (486.27 MiB 509.80 MB)
   Raid Devices : 2
  Total Devices : 1
Preferred Minor : 0

    Update Time : Wed Aug 3 21:11:43 2011
          State : clean
 Active Devices : 1
Working Devices : 1
 Failed Devices : 1
  Spare Devices : 0
       Checksum : 388e903a - correct
         Events : 20332


      Number   Major   Minor   RaidDevice State
this     0       8        1        0      active sync   /dev/sda1

   0     0       8        1        0      active sync   /dev/sda1
   1     1       0        0        1      faulty removed

# mdadm --verbose --assemble /dev/md1 --uuid=52ff2cf2:40981859:e58d8dd6:5faec42c
mdadm: looking for devices for /dev/md1
mdadm: no recogniseable superblock on /dev/dm-8
mdadm: /dev/dm-8 has wrong uuid.
mdadm: no recogniseable superblock on /dev/dm-7
mdadm: /dev/dm-7 has wrong uuid.
mdadm: no recogniseable superblock on /dev/dm-6
mdadm: /dev/dm-6 has wrong uuid.
mdadm: no recogniseable superblock on /dev/dm-5
mdadm: /dev/dm-5 has wrong uuid.
mdadm: no recogniseable superblock on /dev/dm-4
mdadm: /dev/dm-4 has wrong uuid.
mdadm: no recogniseable superblock on /dev/dm-3
mdadm: /dev/dm-3 has wrong uuid.
mdadm: no recogniseable superblock on /dev/dm-2
mdadm: /dev/dm-2 has wrong uuid.
mdadm: no recogniseable superblock on /dev/dm-1
mdadm: /dev/dm-1 has wrong uuid.
mdadm: cannot open device /dev/dm-0: Device or resource busy
mdadm: /dev/dm-0 has wrong uuid.
mdadm: no recogniseable superblock on /dev/md0
mdadm: /dev/md0 has wrong uuid.
mdadm: cannot open device /dev/loop0: Device or resource busy
mdadm: /dev/loop0 has wrong uuid.
mdadm: cannot open device /dev/sdb4: Device or resource busy
mdadm: /dev/sdb4 has wrong uuid.
mdadm: cannot open device /dev/sdb: Device or resource busy
mdadm: /dev/sdb has wrong uuid.
mdadm: cannot open device /dev/sda2: Device or resource busy
mdadm: /dev/sda2 has wrong uuid.
mdadm: cannot open device /dev/sda1: Device or resource busy
mdadm: /dev/sda1 has wrong uuid.
mdadm: cannot open device /dev/sda: Device or resource busy
mdadm: /dev/sda has wrong uuid.

and `mdadm --examine /dev/sda2` could not find any metadata. `/dev/sda2` could still be decrypted using `cryptsetup luksOpen /dev/sda2 sda2_crypt`.

Not knowing about the metadata versions and where they are stored (0.90: at the end of the device), I read several Web resources, joined IRC channels, and came to the conclusion that I should just create a new (degraded) RAID1 and everything would be fine, since I had only one disk.
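(As an aside, this is how I have since been checking for a 0.90 superblock at the end of a device – a minimal sketch, assuming the usual 0.90 layout where the superblock lives in the last 64 KiB-aligned 64 KiB block, and a little-endian machine on which the magic a92b4efc is stored as the byte sequence fc 4e 2b a9. `DEV` is just a placeholder. Please correct me if the offset calculation does not match what the kernel actually uses.)

# sketch: look for a 0.90 superblock at the end of DEV
DEV=/dev/sda2
SIZE_KIB=$(( $(blockdev --getsize64 "$DEV") / 1024 ))
# 0.90 superblock offset: size rounded down to 64 KiB, minus 64 KiB
SB_OFFSET_KIB=$(( SIZE_KIB / 64 * 64 - 64 ))
# first line should start with "fc 4e 2b a9" if a superblock is there
dd if="$DEV" bs=1024 skip="$SB_OFFSET_KIB" count=1 2>/dev/null | hexdump -C | head -n 1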
Booting from the live system Grml [3], which does *not* start `mdadm` or `lvm` during boot, I tried to create a new RAID1 using the following command (a).

# command (a)
mdadm --verbose --create /dev/md1 \
  --assume-clean \
  --level=1 \
  --raid-devices=2 \
  --uuid=52ff2cf2:40981859:e58d8dd6:5faec42c \
  /dev/sda2 missing

I ignored the warning about overwriting metadata because it only referred to booting. Unfortunately `cryptsetup luksOpen /dev/md1 md1_crypt` did not find any LUKS superblock. Therefore I stopped `/dev/md1`, and `cryptsetup luksOpen /dev/sda2 sda2_crypt` still worked.

Then I remembered that the metadata version was originally 0.90, added `--metadata=0.90`, and executed the following (b).

# command (b)
mdadm --verbose --create /dev/md1 \
  --assume-clean \
  --level=1 \
  --raid-devices=2 \
  --uuid=52ff2cf2:40981859:e58d8dd6:5faec42c \
  --metadata=0.90 \
  /dev/sda2 missing

Lucky me, I thought: `cryptsetup luksOpen /dev/md1 md1_crypt` asked me for the passphrase. But I entered it three times and it would not unlock. Instead of trying again – I do not know if it would have worked – I tried `cryptsetup luksOpen /dev/sda2 sda2_crypt` and it asked me for the passphrase too. The third time I seem to have entered it correctly, but I got an error message that it could not be mapped.

--- dmesg ---
Aug 4 00:16:01 grml kernel: [ 7964.786362] device-mapper: table: 253:0: crypt: Device lookup failed
Aug 4 00:16:01 grml kernel: [ 7964.786367] device-mapper: ioctl: error adding target to table
Aug 4 00:16:01 grml udevd[2409]: inotify_add_watch(6, /dev/dm-0, 10) failed: No such file or directory
Aug 4 00:16:01 grml udevd[2409]: inotify_add_watch(6, /dev/dm-0, 10) failed: No such file or directory
Aug 4 00:17:14 grml kernel: [ 8038.196371] md1: detected capacity change from 1999886286848 to 0
Aug 4 00:17:14 grml kernel: [ 8038.196395] md: md1 stopped.
Aug 4 00:17:14 grml kernel: [ 8038.196407] md: unbind<sda2>
Aug 4 00:17:14 grml kernel: [ 8038.212653] md: export_rdev(sda2)
--- dmesg ---

Then I realized that I had probably forgotten to stop `/dev/md1`. After stopping it, `cryptsetup luksOpen /dev/sda2 sda2_crypt` did not succeed anymore and I cannot access my data.

1. Does the `dmesg` output suggest that accessing `/dev/sda2` while the array was assembled caused any breakage?
2. On #lvm and #linux-raid the common explanation was that command (a) had overwritten and thereby damaged the LUKS superblock. Is that possible? I could not find the magic number 0xa92b4efc in the first megabyte of `/dev/sda2` (how I searched is sketched in the PPS below). Did `--assume-clean` prevent that?
3. Is command (b) to blame, or did it probably work and I just had a typo in the passphrase?

I am thankful for any hint to get my data back.

Thanks and sorry for the long message. Any hints on how to shorten it next time are much appreciated.

Paul

PS: A month ago I had copied the content of a 500 GB drive to this one with `dd`. That is why I wanted to resize the partitions. The old drive is still functional, and I am attaching the output of several commands from the current 2 TB drive and the old drive. The `luksDump` output is from the current drive but with the LUKS header from the 500 GB drive. I know that I am publishing the key material for my drive, but if it helps to get my data back I will encrypt from scratch again afterward. I also have dumps of the first MB (in this case) of the partition (`luksHeaderBackup`) from the old and the new drive, but attaching them would exceed the message size limit.
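PPS: Regarding question 2, this is how I searched for the magic numbers – a minimal sketch, assuming the LUKS header starts at offset 0 (ASCII "LUKS" followed by the bytes ba be) and that a v1.x md superblock, had one been written, would sit 4 KiB from the start (the 1.2 default):

# LUKS magic at offset 0? (first line should show |LUKS..| in the ASCII column)
dd if=/dev/sda2 bs=512 count=1 2>/dev/null | hexdump -C | head -n 1
# md v1.2 magic at its default offset of 4 KiB? (a92b4efc is stored
# little-endian, so it would show up as "fc 4e 2b a9" at the line start)
dd if=/dev/sda2 bs=4096 skip=1 count=1 2>/dev/null | hexdump -C | head -n 1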
[1] http://www.saout.de/pipermail/dm-crypt/2011-August/001857.html
[2] http://www.hermann-uwe.de/blog/resizing-a-dm-crypt-lvm-ext3-partition
[3] http://grml.org/
Accessing dm-crypt volume after failed resizing with mdadm/RAID1, dm-crypt/LUKS, LVM

Dear dm-crypt folks,

as you might guess, I am another lost guy turning to you as the last resort to rescue his data. I am sorry for the long text, but I am trying to be as elaborate as possible.

I have a RAID1 (mirroring) setup of which only one drive is assembled, though. It is set up as `/dev/md0` ← `/dev/sda1` and `/dev/md1` ← `/dev/sda2`. `/dev/md1` is encrypted with LUKS and contains an LVM setup with the logical volumes used by `/home/` and `/root/`. A month ago the 500 GB drive was replaced by a 2 TB drive, and I copied the whole data with `dd_rescue`, without any errors, from the old to the new drive. As a consequence, only the old size of 500 GB is usable, and the partitions have to be resized/grown to use the whole 2 TB. But to emphasize it again: I still have the old drive available.

I wanted to resize the partitions today, so I followed the guide by Uwe Hermann [1], which I had also followed some years ago, when it worked without any problem. I booted from a USB medium with Grml 5.2011 (cryptsetup 1.3.0) [2] and followed the steps from the guide. Please note that Grml by default does not assemble any RAIDs, that is, `mdadm` was not run. (And having only one drive, I did not think of the RAID1 as something that needed to be taken care of too.)

1. `fdisk /dev/sda`
2. Remove the second partition.
3. Create a new partition with the same starting sector (63 was chosen automatically, since there are only two partitions and it was the second).
4. Choose the proposed end sector, which was the maximum.
5. Choose type »Linux raid autodetect« (0xfd).
6. Save it using `w`.

Afterward I did `cryptsetup luksOpen /dev/sda2 foo`, `pvresize /dev/mapper/foo`, `service lvm2 start`, `lvresize -L +300GB /dev/speicher/home` and `lvresize -L +20GB /dev/speicher/other`. Then I ran `fsck -f /dev/speicher/home` and `xfs_check /mnt/other_mounted`, and there were no errors at all. After running `resize2fs /dev/speicher/home` and `xfs_growfs /mnt/other_mounted`(?), I rebooted, just to be surprised that I was not asked for the LUKS password when booting into Debian. I only saw `evms_activate is not available`.

Then I booted Grml again to recreate the initrd image with `update-initramfs -u`, thinking it needed to be updated too. I was happy to see that I could still access `/dev/sda2` just fine using `cryptsetup luksOpen /dev/sda2 sda2_crypt` and could mount everything in it – after `service lvm2 start`, all volumes – to `chroot` into the system [3] and rebuild the initrd image. But updating the initrd image was to no avail, although the `evms_activate is not available` message disappeared. Here I probably also have to mention that I have had `mdadm` on hold on the Debian system for quite some time because of some problems, and I did not dare to touch it.

Anyway, I found out that the system was not able to assemble `/dev/md1` from `/dev/sda2`. This did not work under Grml either, and `mdadm` could not find the md superblock on `/dev/sda2`.

# mdadm --examine /dev/sda2
mdadm: No md superblock detected on /dev/sda2.

# blkid
/dev/sda1: UUID="fb7f3dc5-d183-cab6-1212-31201a2207b9" TYPE="linux_raid_member"
/dev/sda2: UUID="cb6681b4-4dda-4548-8c59-3d9838de9c22" TYPE="crypto_LUKS"
# different UUID than before and “wrong” type

# cryptsetup luksOpen /dev/sda2 sda2_crypt
# still worked

On #debian somebody told me that the md (0.90) superblock is stored at the end of the partition, that it was probably lost when the partition was enlarged, and that I should have grown the RAID too.
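For the record, my current understanding is that the step I missed looks roughly like this – a sketch only, untested, and the exact order may well be wrong, so please correct me:

# untested sketch: grow the array so the 0.90 superblock is rewritten at
# the new end of /dev/sda2, then resize the layers on top of /dev/md1
mdadm --assemble /dev/md1 /dev/sda2
mdadm --grow /dev/md1 --size=max
cryptsetup luksOpen /dev/md1 md1_crypt
pvresize /dev/mapper/md1_crypt
# then lvresize, fsck and resize2fs/xfs_growfs as in the guide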
Searching the Internet for help, I found several suggestions and tried to recreate the RAID with the following command.

# mdadm --create /dev/md1 --assume-clean --level=1 --raid-devices=2 --uuid=52ff2cf2-4098-1859-e58d-8dd65faec42c /dev/sda2 missing

I got a warning that there is metadata at the beginning, that I should not go on if the device is used for `/boot`, and that I could use `--metadata=0.90` instead. Since it was not used for `/boot`, I chose to go on. The RAID was created, but `cryptsetup luksOpen /dev/md1 md1_crypt` said that it was no LUKS device. Therefore I stopped the RAID, and `cryptsetup luksOpen /dev/sda2 sda2_crypt` still worked.

Then I was told on IRC that, with only one drive in a RAID1, it does not matter whether you alter `/dev/sda2` or `/dev/md1`, and that I should try to create the RAID again. Remembering that before the resizing the RAID metadata (also on `/dev/sda1`) was `0.90`, I passed `--metadata=0.90` to the `mdadm --create` command.

# mdadm --create /dev/md1 --assume-clean --metadata=0.90 --level=1 --raid-devices=2 --uuid=52ff2cf2-4098-1859-e58d-8dd65faec42c /dev/sda2 missing

I got an error message that the device is already part of a RAID, which I ignored, and went on. At first I was happy, because

# cryptsetup luksOpen /dev/md1 md1_crypt

worked and asked me for the passphrase. But I typed the correct passphrase several times and it was rejected. Then – I had probably forgotten to stop the RAID –

# cryptsetup luksOpen /dev/sda2 sda2_crypt

showed the same behavior, which was probably due to typos, and it seemed to work once. But I got the following error messages, making me realize that the RAID was probably still running, and I stopped it right away.

Aug 4 00:16:01 grml kernel: [ 7964.786362] device-mapper: table: 253:0: crypt: Device lookup failed
Aug 4 00:16:01 grml kernel: [ 7964.786367] device-mapper: ioctl: error adding target to table
Aug 4 00:16:01 grml udevd[2409]: inotify_add_watch(6, /dev/dm-0, 10) failed: No such file or directory
Aug 4 00:16:01 grml udevd[2409]: inotify_add_watch(6, /dev/dm-0, 10) failed: No such file or directory
Aug 4 00:17:14 grml kernel: [ 8038.196371] md1: detected capacity change from 1999886286848 to 0
Aug 4 00:17:14 grml kernel: [ 8038.196395] md: md1 stopped.
Aug 4 00:17:14 grml kernel: [ 8038.196407] md: unbind<sda2>
Aug 4 00:17:14 grml kernel: [ 8038.212653] md: export_rdev(sda2)

After that, `cryptsetup luksOpen /dev/sda2 sda2_crypt` always failed. Now, wanting to be smart, I saved the LUKS header,

# cryptsetup luksHeaderBackup /dev/sda2 --header-backup-file /home/grml/20110804--031--luksHeaderBackup

shut the system down, connected the old drive, booted Grml, saved the LUKS header of `/dev/sda2` from the 500 GB drive, switched the drives again, and restored the old header from before the resizing to the new drive.

# cryptsetup luksHeaderRestore /dev/sda2 --header-backup-file /home/grml/20110804--031--luksHeaderBackup

Only to find out that this did not help either. I have some system information from the late recovery attempts, and I still have the old 500 GB drive. Is there any way to recover the data? The current situation is that `luksOpen` does not succeed on `/dev/md1` or `/dev/sda2`; that is, it is detected as a LUKS device, but the passphrase is not accepted (I even typed it in the clear on the console and copied it into the prompt).
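Here is a sketch of how I would now compare the two saved headers before restoring anything else – assuming `cryptsetup luksDump` also works on a backup file and GNU `cmp` is available; the file name for the old drive's backup is hypothetical:

NEW=/home/grml/20110804--031--luksHeaderBackup
OLD=/home/grml/old-drive--luksHeaderBackup   # hypothetical file name
# show version, cipher, payload offset and key slots of both backups
cryptsetup luksDump "$NEW"
cryptsetup luksDump "$OLD"
# byte-wise comparison; damaged key slots would show up as differences
cmp "$NEW" "$OLD"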
### New drive ###

# mdadm --examine /dev/sda2
/dev/sda2:
          Magic : a92b4efc
        Version : 0.90.00
           UUID : 52ff2cf2:40981859:d8b78f65:99226e41 (local to host grml)
  Creation Time : Thu Aug 4 00:05:57 2011
     Raid Level : raid1
  Used Dev Size : 1953013952 (1862.54 GiB 1999.89 GB)
     Array Size : 1953013952 (1862.54 GiB 1999.89 GB)
   Raid Devices : 2
  Total Devices : 2
Preferred Minor : 1

    Update Time : Thu Aug 4 00:05:57 2011
          State : clean
 Active Devices : 1
Working Devices : 1
 Failed Devices : 1
  Spare Devices : 0
       Checksum : bf78bfbf - correct
         Events : 1


      Number   Major   Minor   RaidDevice State
this     0       8        2        0      active sync   /dev/sda2

   0     0       8        2        0      active sync   /dev/sda2
   1     0       0        0        0      spare

# blkid
/dev/sda1: UUID="fb7f3dc5-d183-cab6-1212-31201a2207b9" TYPE="linux_raid_member"
/dev/sda2: UUID="52ff2cf2-4098-1859-d8b7-8f6599226e41" TYPE="linux_raid_member"

### Old drive ###

# mdadm --examine /dev/sda2
/dev/sda2:
          Magic : a92b4efc
        Version : 0.90.00
           UUID : 52ff2cf2:40981859:e58d8dd6:5faec42c
  Creation Time : Wed Mar 26 11:50:04 2008
     Raid Level : raid1
  Used Dev Size : 487885952 (465.28 GiB 499.60 GB)
     Array Size : 487885952 (465.28 GiB 499.60 GB)
   Raid Devices : 2
  Total Devices : 1
Preferred Minor : 1

    Update Time : Sat Jun 18 14:25:10 2011
          State : clean
 Active Devices : 1
Working Devices : 1
 Failed Devices : 1
  Spare Devices : 0
       Checksum : 380692fc - correct
         Events : 25570832


      Number   Major   Minor   RaidDevice State
this     0       8        2        0      active sync   /dev/sda2

   0     0       8        2        0      active sync   /dev/sda2
   1     1       0        0        1      faulty removed

Please tell me what other information you need.

Thanks in advance,

Paul

PS: Please excuse this long message and any mistakes in it. It is almost four in the morning, and after 10 hours of debugging I am quite lost.

[1] http://www.hermann-uwe.de/blog/resizing-a-dm-crypt-lvm-ext3-partition
[2] http://grml.org/
[3] http://wiki.debian.org/DebianInstaller/Rescue/Crypto
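PPS: One more check I can run if it helps: with 0.90 metadata the array data should start at offset 0 of the component, so `/dev/md1` and `/dev/sda2` should present identical bytes. A sketch (assuming GNU `cmp`; untested on my setup):

mdadm --assemble /dev/md1 /dev/sda2
# compare the first MiB of the array with the raw partition
cmp -n 1048576 /dev/md1 /dev/sda2 && echo "first MiB identical"
mdadm --stop /dev/md1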
Attachments:
20110804--new-drive--blkid
20110804--new-drive--fdisk-l-sda
20110804--new-drive--fdisk-s-sda1
20110804--new-drive--fdisk-s-sda2
20110804--new-drive--mdadm--examine-sda1
20110804--new-drive--mdadm--examine-sda2
20110804--new-drive--sfdisk-d-sda
20110804--new-drive--with-header-from-old-drive--cryptsetup-luksDump
20110804--old-drive--blkid
20110804--old-drive--fdisk-l
20110804--old-drive--fdisk-s-sda1
20110804--old-drive--fdisk-s-sda2
20110804--old-drive--mdadm--examine-sda1
20110804--old-drive--mdadm--examine-sda2
20110804--old-drive--sfdisk-d-sda