Re: Two Drive Failure on RAID-5

----- Original Message ----- From: "David Greaves" <david@xxxxxxxxxxxx>
To: "Cry" <cry_regarder@xxxxxxxxx>
Cc: <linux-raid@xxxxxxxxxxxxxxx>
Sent: Wednesday, May 21, 2008 10:15 PM
Subject: Re: Two Drive Failure on RAID-5


Cry wrote:
David Greaves <david <at> dgreaves.com> writes:
Cry wrote:
ddrescue /dev/SOURCE /dev/TARGET /somewhere_safe/logfile


unless you've rebooted:
blockdev --setrw /dev/SOURCE
blockdev --setra  <saved readahead value> /dev/SOURCE
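
(Side note, not part of the original instructions: blockdev --getra /dev/SOURCE prints whatever readahead value is currently set, if you want to check it before and after.)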

mdadm --assemble --force /dev/md0 /dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1
/dev/sde1

cat /proc/mdstat will show the drive status
mdadm --detail /dev/md0
mdadm --examine /dev/sd[abcdef]1 [components]

I performed the above steps; however, I used dd_rescue instead of ddrescue.
Similar software. I think dd_rescue is more 'scripted' and less well maintained.

]# dd_rescue -l sda_rescue.log -o sda_rescue.bad -v /dev/sda /dev/sdg1

doh!!
You copied the disk (/dev/sda) into a partition (/dev/sdg1)...
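
For reference, the whole-disk copy the earlier instructions meant would look more like this (GNU ddrescue syntax; assuming /dev/sdg really is the 750G replacement disk):

ddrescue /dev/sda /dev/sdg /somewhere_safe/logfile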


dd_rescue: (info): /dev/sda (488386592.0k): EOF
Summary for /dev/sda -> /dev/sdg1:
dd_rescue: (info): ipos: 488386592.0k, opos: 488386592.0k, xferd: 488386592.0k
                   errs:    504, errxfer:       252.0k, succxfer: 488386336.0k
             +curr.rate:    47904kB/s, avg.rate:    14835kB/s, avg.load:  9.6%
So you lost 252k of data. There may be filesystem corruption, a file may be corrupt, or some blank disk space may just be even more blank. It's almost impossible to tell.

dd_rescue shows it when the target device is full.
The errs count is divisible by 8, so I think these are only bad sectors (504 errors x 512-byte hard blocks = 252k, which matches errxfer).

But let me note:
With the default -b 64k, dd_rescue sometimes drops the entire soft-block area on the first error! If you want a more precise result, run it again with -b 4096 and -B 1024, and if you can, don't copy the drive to a partition! :-)
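
A sketch of what such a rerun could look like, assuming the whole replacement disk /dev/sdg as the target and made-up log/bad-block file names (and note David's concern below about stressing /dev/sda further):

dd_rescue -v -b 4096 -B 1024 -l sda_rescue2.log -o sda_rescue2.bad /dev/sda /dev/sdg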


[aside: It would be nice if we could take the output from ddrescue and friends
to determine what the lost blocks map to via the md stripes.]
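
A very rough, untested sketch of that mapping, assuming a 0.90 superblock (member data starts at sector 0), the default left-symmetric layout, a 64k chunk and the five devices above (mdadm --detail /dev/md0 shows the real values):

BADSECT=123456789                    # hypothetical bad 512-byte sector on /dev/sda1
CHUNK_SECTORS=$((64 * 1024 / 512))   # 64k chunk = 128 sectors
RAID_DISKS=5
STRIPE=$((BADSECT / CHUNK_SECTORS))
PARITY_DISK=$((RAID_DISKS - 1 - STRIPE % RAID_DISKS))
echo "sector $BADSECT -> stripe $STRIPE, parity on raid device index $PARITY_DISK"
# if mdadm --examine shows /dev/sda1 at that raid device index, the lost
# blocks only hit parity in this stripe; otherwise they hit data (or free space)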

/dev/sdg1 is my replacement drive (750G) that I had tried to sync previously.
No. /dev/sdg1 is a *partition* on that drive, not the whole disk.
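
A quick sanity check of the disk vs. partition distinction (standard kernel view, nothing special):

grep sdg /proc/partitions

The whole disk (sdg) and the partition (sdg1) report different sizes there, and the rescued image now starts at the beginning of the partition, i.e. some way into the disk.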

I'm concerned that running the first ddrescue may have stressed /dev/sda, and that you'd lose data running it again with the correct arguments.

How do I transfer the label from /dev/sda (no partitions) to /dev/sdg1?
Can anyone suggest anything?

Cry, I only have this idea:
dd_rescue -v -m 128k -r /dev/source -S 128k superblock.bin
losetup /dev/loop0 superblock.bin
mdadm --build -l linear --raid-devices=2 /dev/md1 /dev/sdg1 /dev/loop0

And the working RAID member would be /dev/md1. ;-)
But only for recovery!!!

(Only an idea, not tested.)

Cheers,
Janos


Cry, don't do this...

I wonder about
dd if=/dev/sdg1 of=/dev/sdg
but goodness knows if it would work... it'd rely on dd's reads from the start of the partition device and its writes to the disk device not overlapping - which they shouldn't, but...
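
One way to sanity-check that overlap worry, assuming a normal partition table on /dev/sdg (hypothetical check, not something tried here):

fdisk -lu /dev/sdg

shows the start sector of sdg1; a sequential dd then keeps its read position inside sdg1 that many sectors ahead of its write position on sdg, so nothing should be overwritten before it has been read.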

David
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html
