Dear all,
Sorry about the break; I've been away and doing other things...
On Sun, 06 Jan 2013 12:59:07 -0000, Alex Leach <beamesleach@xxxxxxxxx> wrote:
> I think my only option is to zero the superblock with mdadm and try to
> recreate the array in Windows, with whatever version of Intel Matrix
> Storage Manager was initially installed on the machine, hoping to God
> that the array contents don't get overwritten. Then, hopefully the
> original array size would be available again and the ext4 partition
> would fit within it. Sounds dangerous...
So, that is what I did. Specifically:
0. Used serial numbers from /dev/disk/by-id to figure out the original
   order that the member disks were plugged into the motherboard. Swapped
   /dev/sdg and /dev/sdb, so that they were plugged into the motherboard
   ports in the same order as when I first created the array.
1. mdadm --zero-superblock /dev/sd[abg]
2. Re-create RAID5 array in Intel Matrix Storage Manager 8.9.
   This just initialised the container and member array, i.e. wrote the
   imsm container superblock. The re-sync was pending.
3. Reboot into Arch Linux.
   mdadm started re-sync'ing the array. I let it finish...
4. mdadm -D /dev/md/RAID5 | grep Size
     Array Size : 586066944 (558.92 GiB 600.13 GB)
     Used Dev Size : 293033600 (279.46 GiB 300.07 GB)
     Chunk Size : 64K
Assuming the above units are in kibibytes, multiplying the Array Size by 2
should give the usable number of 512-byte sectors: 1,172,133,888.
The array size and used dev size are now as they were before everything
went tits up. That's good, but testdisk still finds that the partitions
have moved up the disk. The discovered ext4 partition now extends 3,576
sectors beyond the end of the array.
In sectors, these are the partitions testdisk finds:

Partition        Start            End   (Diff.)         Size
1                2,128        206,920       +73      204,800
2              214,528    668,208,632    +7,673  667,994,112
3          668,208,640  1,172,137,464    +7,680  503,928,832

where Diff. is the number of sectors the partition seems to have moved, cf.
the original partition table.
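
To make the arithmetic above easy to re-check, here is a minimal sketch. It assumes 512-byte sectors and that mdadm's "Array Size" is in KiB (both assumptions on my part), and it takes the Start, End and Diff. figures straight from the testdisk table. The variable names are mine, purely for illustration.

```python
SECTOR = 512  # bytes per sector (assumed)

# "Array Size" from mdadm -D, assumed to be in KiB.
array_kib = 586_066_944
array_sectors = array_kib * 1024 // SECTOR

# (start, end, diff) in sectors, copied from the testdisk table.
partitions = {
    1: (2_128, 206_920, 73),
    2: (214_528, 668_208_632, 7_673),
    3: (668_208_640, 1_172_137_464, 7_680),
}

# How far each partition's End lies past the array's sector count.
overshoot = {n: max(0, end - array_sectors)
             for n, (start, end, diff) in partitions.items()}

# The array's chunk size is 64K; express each shift in chunks to see
# whether it is a whole number of chunks.
chunk_sectors = 64 * 1024 // SECTOR
shift_in_chunks = {n: diff / chunk_sectors
                   for n, (start, end, diff) in partitions.items()}

print("array sectors:", array_sectors)
for n in sorted(partitions):
    print(n, "overshoot:", overshoot[n], "shift in chunks:", shift_in_chunks[n])
```

With these figures the array holds 1,172,133,888 sectors and only partition 3 overshoots, by 3,576 sectors. The shift of partition 3 (7,680 sectors) works out to exactly 60 chunks of 64K, whereas the shifts of partitions 1 and 2 are not whole numbers of chunks; I don't know whether that is significant.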
---------------------
Partition 2 is still recoverable. I can browse the array contents using
testdisk and can see that the Windows directory structure is still there.
Partition 1 seems corrupted. I can't browse the contents in testdisk and
was unable to mount the partition, even before this last array re-creation.
Partition 3 is still a problem. testdisk refuses to recover it and I can't
browse its contents.
I seem to have a new problem, too. I am now unable to write a partition
table to the disk! I've tried using sfdisk, parted and testdisk. Each of
these programs hangs indefinitely and the process is immune to kill
signals. The machine needs a hard reboot to get rid of the hung process.
Two minutes after trying to write a partition table, dmesg reports the
following trace, and then repeats it every two minutes:
[ 479.779519] INFO: task sfdisk:1020 blocked for more than 120 seconds.
[ 479.779522] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 479.779525] sfdisk D 0000000000000001 0 1020 1019 0x00000000
[ 479.779529] ffff8805e9cb3978 0000000000000086 ffff8805e9cb3801 ffff8805e9cb3fd8
[ 479.779534] ffff8805e9cb3948 ffff8805e9cb3fd8 ffff8805e9cb3fd8 ffff8805e9cb3fd8
[ 479.779538] ffff88061e0c5040 ffff8805e9e76450 ffff8805e9cb3908 ffff8805edd86200
[ 479.779542] Call Trace:
[ 479.779551] [<ffffffff81164024>] ? __mem_cgroup_commit_charge+0xd4/0x350
[ 479.779558] [<ffffffff8107b79b>] ? prepare_to_wait+0x5b/0x90
[ 479.779569] [<ffffffffa03efc55>] md_write_start+0xb5/0x1a0 [md_mod]
[ 479.779573] [<ffffffff8107b9a0>] ? abort_exclusive_wait+0xb0/0xb0
[ 479.779577] [<ffffffffa02bad0b>] make_request+0x3b/0x6c0 [raid456]
[ 479.779582] [<ffffffff81067e68>] ? lock_timer_base.isra.38+0x38/0x70
[ 479.779585] [<ffffffff81067400>] ? internal_add_timer+0x20/0x50
[ 479.779591] [<ffffffffa03eac6c>] md_make_request+0xfc/0x240 [md_mod]
[ 479.779595] [<ffffffff81113075>] ? mempool_alloc_slab+0x15/0x20
[ 479.779601] [<ffffffff8122a9c2>] generic_make_request+0xc2/0x110
[ 479.779604] [<ffffffff8122aa89>] submit_bio+0x79/0x160
[ 479.779609] [<ffffffff811a2c15>] ? bio_alloc_bioset+0x65/0x120
[ 479.779612] [<ffffffff8119d1e5>] submit_bh+0x125/0x210
[ 479.779616] [<ffffffff811a0210>] __block_write_full_page+0x1f0/0x360
[ 479.779620] [<ffffffff8119e0b0>] ? end_buffer_async_read+0x200/0x200
[ 479.779623] [<ffffffff811a3df0>] ? I_BDEV+0x10/0x10
[ 479.779627] [<ffffffff811a3df0>] ? I_BDEV+0x10/0x10
[ 479.779630] [<ffffffff8119e0b0>] ? end_buffer_async_read+0x200/0x200
[ 479.779633] [<ffffffff811a0466>] block_write_full_page_endio+0xe6/0x130
[ 479.779637] [<ffffffff811a04c5>] block_write_full_page+0x15/0x20
[ 479.779641] [<ffffffff811a4478>] blkdev_writepage+0x18/0x20
[ 479.779644] [<ffffffff81119ac7>] __writepage+0x17/0x50
[ 479.779647] [<ffffffff81119f92>] write_cache_pages+0x1f2/0x4e0
[ 479.779650] [<ffffffff81119ab0>] ? global_dirtyable_memory+0x40/0x40
[ 479.779655] [<ffffffff8116c047>] ? do_sync_write+0xa7/0xe0
[ 479.779658] [<ffffffff8111a2ca>] generic_writepages+0x4a/0x70
[ 479.779662] [<ffffffff8111bace>] do_writepages+0x1e/0x40
[ 479.779666] [<ffffffff81111a49>] __filemap_fdatawrite_range+0x59/0x60
[ 479.779669] [<ffffffff81111b50>] filemap_write_and_wait_range+0x50/0x70
[ 479.779673] [<ffffffff811a4744>] blkdev_fsync+0x24/0x50
[ 479.779676] [<ffffffff8119b6dd>] do_fsync+0x5d/0x90
[ 479.779680] [<ffffffff8116ca62>] ? sys_write+0x52/0xa0
[ 479.779683] [<ffffffff8119ba90>] sys_fsync+0x10/0x20
[ 479.779688] [<ffffffff81495edd>] system_call_fastpath+0x1a/0x1f
Apologies for not having all the debug symbols installed. I've been unable
to locate the necessary package/s on Arch that contain them.
This is with mdadm-3.2.6. I think I'll try again with an old version of
dmraid on recovabuntu, one released around the time I first created the
ext4 partition, which was in Ubuntu.
As to why I get this traceback, no idea.
---------------------------
I'd really appreciate some suggestions on how to recover the ext4
partition. My current idea is this:
1. Use ddrescue to copy the array from sector 668,208,640 (the start
testdisk found) up to the end. ddrescue does not take dd-style operands;
the input offset is given in bytes, here 668208640 * 512:
$ ddrescue --input-position=342122823680 /dev/md/RAID5 \
    /media/bigdisk/kubuntu.ext4
2. Probably get a new HDD, create an MBR in Windows 7, and make a new
partition exactly the same size as before.
$ [cs]?fdisk ...
Not exactly sure what information I'd need to specify here, other than
size, primary and partition Id (83). Use whichever program allows all the
necessary options.
Assume, just created partition /dev/sdXx.
3. Copy the backed up image on to the raw device / partition file.
$ dd bs=64K if=/media/bigdisk/kubuntu.ext4 of=/dev/sdXx
4. Hope fsck works...
$ fsck.ext4 -f -p /dev/sdXx
5. Hopefully mount the partition and browse data.
6. Re-create and re-format partitions on the RAID array and copy stuff
back.
7. Get an incremental back-up system running.
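
For what it's worth, the copy in step 1 boils down to "seek to a byte offset in the array and read to the end". Here is a minimal sketch of that idea, assuming 512-byte sectors. The function name `copy_from_sector` is made up for illustration, and unlike ddrescue it makes no attempt to retry or skip unreadable sectors. The demo at the bottom only touches scratch files in a temporary directory.

```python
import os
import tempfile

SECTOR = 512  # bytes per sector (assumed)


def copy_from_sector(src_path, dst_path, start_sector, block_size=1024 * 1024):
    """Copy src from start_sector to its end into dst; return bytes copied.

    The source is opened read-only. Read errors are NOT handled.
    """
    copied = 0
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        src.seek(start_sector * SECTOR)
        while True:
            block = src.read(block_size)
            if not block:
                break
            dst.write(block)
            copied += len(block)
    return copied


# Demo on scratch files: two sectors of "A" followed by 700 bytes of "B".
with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "array.img")
    dst = os.path.join(tmp, "tail.img")
    with open(src, "wb") as f:
        f.write(b"A" * (2 * SECTOR) + b"B" * 700)
    demo_copied = copy_from_sector(src, dst, start_sector=2)
    with open(dst, "rb") as f:
        demo_data = f.read()

print("copied", demo_copied, "bytes")
```

On the real array the call would be something like `copy_from_sector("/dev/md/RAID5", "/media/bigdisk/kubuntu.ext4", 668_208_640)`, run as root, but a tool that copes with bad sectors is the safer choice for the actual rescue.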
Does anyone know if there is even a remote possibility this could work?
I've read through some of the kernel wiki on ext4 partitions
(https://ext4.wiki.kernel.org/index.php/Ext4_Disk_Layout), but doubt my
ability to use any of that information effectively.
[Question - any ext4 gurus here who would know?]
Could I increase the size of the output file of step 1, so that it appears
to be the same size as the original partition? i.e. If I were to pad it
with zeros, could I attempt to mount the file as an ext4 partition?
Something like:
$ dd if=/dev/zero bs=512 count=3576 >> /media/bigdisk/kubuntu.ext4
$ e2fsck /media/bigdisk/kubuntu.ext4
$ mount -t ext4 /media/bigdisk/kubuntu.ext4 /mnt
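
An alternative to appending a fixed number of zeroed sectors would be to extend the image to the full partition size that testdisk reports (503,928,832 sectors). A minimal sketch, assuming 512-byte sectors and a filesystem where extending a file yields zero-filled space (true on Linux, where the new area is normally sparse). The helper name `pad_to_sectors` is hypothetical. By my arithmetic the array can supply 1,172,133,888 - 668,208,640 = 503,925,248 sectors from that start, which is 3,584 short of the reported Size rather than 3,576, so padding to a target size sidesteps the question of which figure is right.

```python
import os
import tempfile

SECTOR = 512  # bytes per sector (assumed)


def pad_to_sectors(path, total_sectors):
    """Extend the file at path to total_sectors * SECTOR bytes.

    Never shrinks the file. Returns the number of bytes added.
    """
    target = total_sectors * SECTOR
    current = os.path.getsize(path)
    if current >= target:
        return 0
    with open(path, "r+b") as f:
        f.truncate(target)  # new area reads back as zeros on Linux
    return target - current


# Demo on a scratch file: 1000 bytes of data padded out to 4 sectors.
with tempfile.TemporaryDirectory() as tmp:
    image = os.path.join(tmp, "image.ext4")
    with open(image, "wb") as f:
        f.write(b"\xff" * 1000)
    demo_added = pad_to_sectors(image, 4)
    demo_size = os.path.getsize(image)
    with open(image, "rb") as f:
        demo_tail = f.read()[1000:]

print("added", demo_added, "bytes, new size", demo_size)
```

For the real image the call would be `pad_to_sectors("/media/bigdisk/kubuntu.ext4", 503_928_832)`. Whether e2fsck can do anything useful with a zeroed tail is exactly what I'm asking about.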
Any comments, suggestions or insults completely welcome.
Kind regards,
Alex
P.S. Sorry that I've been unable to report any hexdump results. Basically,
I don't (yet) have a clue how I'd locate the ext3 file headers, or what to
expect them to look like. Over the course of this week, I'll try and make
my way through this ext3 recovery tutorial:
http://carlo17.home.xs4all.nl/howto/undelete_ext3.html
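
In lieu of a hexdump, one check I can make is whether an ext superblock sits where testdisk says the partition starts. Going by the Ext4 Disk Layout page, the primary superblock begins 1024 bytes into the filesystem, with the 16-bit magic 0xEF53 at offset 0x38, the low 32 bits of the block count at 0x04 and the log2 block size at 0x18, all little-endian. A minimal sketch follows. The function name `read_superblock` is made up, and it ignores the high 32 bits of the block count, which assumes the filesystem has fewer than 2^32 blocks. The demo builds a fake image in a scratch file rather than touching the array.

```python
import os
import struct
import tempfile

SUPERBLOCK_OFFSET = 1024  # primary superblock starts 1024 bytes into the fs
EXT_MAGIC = 0xEF53        # s_magic, shared by ext2/ext3/ext4


def read_superblock(path, fs_offset=0):
    """Return basic superblock fields, or None if the magic is absent.

    fs_offset is the byte offset of the filesystem within path.
    """
    with open(path, "rb") as f:
        f.seek(fs_offset + SUPERBLOCK_OFFSET)
        sb = f.read(1024)
    if len(sb) < 0x3A:
        return None
    (magic,) = struct.unpack_from("<H", sb, 0x38)
    if magic != EXT_MAGIC:
        return None
    (blocks_lo,) = struct.unpack_from("<I", sb, 0x04)       # s_blocks_count_lo
    (log_block_size,) = struct.unpack_from("<I", sb, 0x18)  # s_log_block_size
    block_size = 1024 << log_block_size
    return {"block_size": block_size,
            "blocks": blocks_lo,
            "bytes": blocks_lo * block_size}


# Demo: a fake 2 KiB image with just the three fields filled in.
fake = bytearray(2048)
struct.pack_into("<I", fake, SUPERBLOCK_OFFSET + 0x04, 1000)       # 1000 blocks
struct.pack_into("<I", fake, SUPERBLOCK_OFFSET + 0x18, 2)          # 4096-byte blocks
struct.pack_into("<H", fake, SUPERBLOCK_OFFSET + 0x38, EXT_MAGIC)
with tempfile.TemporaryDirectory() as tmp:
    image = os.path.join(tmp, "fake.ext4")
    with open(image, "wb") as f:
        f.write(fake)
    demo_sb = read_superblock(image)

print(demo_sb)
```

Against the array this would be `read_superblock("/dev/md/RAID5", 668_208_640 * 512)`. If it returns something, the "bytes" value divided by 512 should not exceed the 503,928,832 sectors testdisk reports for the partition.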