Re: RAID5 crashed for unknown reason on old 2.6.16 kernel

Hi all,

I got all my data back from a degraded RAID5 array with 3 disks.
The only point worth mentioning: XFS as the underlying file
system is unsuitable for small/cheap NAS boxes because it is not endian-safe.
I had to buy a PowerPC-driven Mac to replay the XFS journal...

That leads to my question to the list: does anybody know whether BTRFS is
endian-safe, or what an endian-safe alternative to ext3/ext would be?

Regards,
Markus

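To illustrate the byte-order issue in general terms, here is a minimal Python sketch (not taken from the NAS firmware or from mdadm) that reads the same four bytes in both byte orders. The bytes are the first four of the HDD2 superblock dump quoted further down in this thread; the variable names are made up for illustration.

```python
# First four bytes of the HDD2 superblock dump quoted later in this thread.
raw = bytes.fromhex("a92b4efc")

# Interpret the same bytes in both byte orders.
as_big = int.from_bytes(raw, "big")        # value as stored by the big-endian NAS
as_little = int.from_bytes(raw, "little")  # value a little-endian host sees if it does not swap

print(f"big-endian:    0x{as_big:08x}")
print(f"little-endian: 0x{as_little:08x}")
```

Only the big-endian reading gives the md superblock magic, which is why metadata written in host byte order has to be byte-swapped (or read on a machine with the same byte order) before it makes sense.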


On Tue, Jun 29, 2010 at 8:50 AM, Neil Brown <neilb@xxxxxxx> wrote:
> On Mon, 28 Jun 2010 17:29:37 +0200
> Markus Hennig <mhennig@xxxxxxxxx> wrote:
>
>> Hi all,
>>
>> in the (unlikely) case that somebody is interested in a last update:
>>
>> I learned in the meantime that the UUID as well as the mdadm version
>> are part of the checksum, and that the checksum is calculated on the
>> first 1kb of the 4kb v0.90 superblock.
>> (https://raid.wiki.kernel.org/index.php/RAID_superblock_formats#The_version-0.90_Superblock_Format)
>>
>> Via hexedit I set the UUID on HDD2 back to the correct value and also
>> changed the version information from 0.91.00 (0x5B) to 0.90 (0x5A).
>> With that done, the checksum was correct and equal to the expected one.
>>
>> mdadm --assemble then worked like a charm and my RAID5 is back.
>
> Thanks for letting us know the resolution.
> I cannot imagine how all those '1's got into the metadata where they
> shouldn't be.
>
> Based on the update times and event counter, the HDD2 was slightly 'older'
> than the other devices.  Hopefully nothing had changed on the array in the
> intervening time.
>
> You should have been able to assemble the array with just the 3 sane devices
> and had a degraded RAID5.  Then add the fourth device and let it recover.
>
> However what you did seems to have worked, so if your data looks OK, you
> should be safe.
>
> NeilBrown
>
>
>>
>> That's it,
>> Markus
>>
>>
>> On Sat, Jun 26, 2010 at 11:22 PM, Markus Hennig <mhennig@xxxxxxxxx> wrote:
>> > Hi all,
>> >
>> > my RAID5 with 4 disks crashed on a Buffalo "NAS" box (big-endian!) -
>> > no logs of course...
>> > I immediately made images of all disks and am now trying to recover my very
>> > valuable content on a Linux box running GRML 4/10 (little-endian!)
>> > with 2.6.33 and mdadm - v3.1.1.
>> > Some blocks were not readable from HDD2, maybe that's the reason why
>> > the Buffalo box shut down.
>> >
>> >
>> > What I know already:
>> >
>> > - the RAID5 was created with a very old set of software:
>> > linux-2.6.16-tshtgl.tgz   mdadm-2.5.2.tgz   xfsprogs-2.5.6_arm.tgz
>> > - the Buffalo box blinked red on HDD2
>> > - the box ran a rebuild on HDD4; I don't know whether it had already finished
>> > - all disks are identical, 250GB
>> >
>> >
>> > Open questions for which I wasn't able to find an answer myself:
>> >
>> > What triggers the event count? And why is the event counter on HDD2
>> > just 129, but 131 on all the others?
>> > Can that cause problems while rescuing my data, and how can I work around it?
>> >
>> >
>> > What is that "UUID : ffffffff:ffffffff:ffffffff:ffffffff" on HDD2?
>> > What does it mean?
>> >
>> > It's really in the superblock on the hard disk:
>> >  hexdump -s 488006273b -C hdd2_ddrescue
>> >  3a2cc50200  a9 2b 4e fc 00 00 00 00  00 00 00 5b 00 00 00 00  |.+N........[....|
>> >  3a2cc50210  00 00 00 00 ff ff ff ff  41 a0 de f0 00 00 00 05  |........A.......|
>> >  3a2cc50220  0e 83 39 c0 00 00 00 04  00 00 00 04 00 00 00 01  |..9.............|
>> >  3a2cc50230  00 00 00 00 ff ff ff ff  ff ff ff ff ff ff ff ff  |................|
>> >  3a2cc50240  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
>> > Would it help to rewrite the UUID via hexedit to the correct one?
>> >
>> >
>> > Can somebody explain the meaning of:
>> >  Reshape pos'n : 0
>> >      New Level : raid0
>> >     New Layout : left-asymmetric
>> >  New Chunksize : 0
>> > on HDD2 ?
>> >
>> >
>> > What parameters are included in the checksum?
>> > And how critical is that "Checksum : b8d2c453 - expected 45703820" on HDD2?
>> >
>> >
>> > I have no explanation why "Version :" on HDD2 is "0.91.00"...
>> > I see 0x5B in the partition 3 superblock on HDD2 (and 0x5A on all the
>> > others), so it's really on the disk...  Weird...
>> > Does anybody have an idea about that?
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-raid" in
>> the body of a message to majordomo@xxxxxxxxxxxxxxx
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>
>
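For readers who want to follow the hexdump from the original post, here is a minimal Python sketch that splits the first 64 bytes of the HDD2 superblock into named 32-bit fields. The field names and their order are an assumption based on the version-0.90 superblock layout described on the wiki page linked above, so check them against that page before relying on them. The bytes are read as big-endian because the NAS that wrote them is big-endian.

```python
import struct

# The 64 bytes at offset 0x3a2cc50200 of the HDD2 image, copied from the hexdump.
dump = bytes.fromhex(
    "a92b4efc 00000000 0000005b 00000000"
    "00000000 ffffffff 41a0def0 00000005"
    "0e8339c0 00000004 00000004 00000001"
    "00000000 ffffffff ffffffff ffffffff"
)

# Assumed field order of the first 16 words of a v0.90 superblock.
FIELDS = [
    "md_magic", "major_version", "minor_version", "patch_version",
    "gvalid_words", "set_uuid0", "ctime", "level",
    "size", "nr_disks", "raid_disks", "md_minor",
    "not_persistent", "set_uuid1", "set_uuid2", "set_uuid3",
]

# Unpack 16 unsigned 32-bit words in big-endian order and label them.
sb = dict(zip(FIELDS, struct.unpack(">16I", dump)))

# The UUID is spread over four non-adjacent words.
uuid = ":".join(f"{sb[name]:08x}" for name in
                ("set_uuid0", "set_uuid1", "set_uuid2", "set_uuid3"))

# "hexdump -s 488006273b" counts 512-byte blocks; this is the byte offset.
byte_offset = 488006273 * 512

print(f"magic   : {sb['md_magic']:08x}")
print(f"version : {sb['major_version']}.{sb['minor_version']}.{sb['patch_version']}")
print(f"uuid    : {uuid}")
print(f"level   : {sb['level']}, raid disks: {sb['raid_disks']}")
print(f"offset  : 0x{byte_offset:x}")
```

Under that assumed layout, the 0x5b byte is the minor version (91) and the four ff ff ff ff words are exactly the UUID fields, which matches what mdadm reported for HDD2.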

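On the question of which parameters are included in the checksum: as far as I know, the v0.90 checksum is a plain sum of the superblock's 32-bit words, computed with the checksum field itself set to zero and with carries folded back into 32 bits. The Python sketch below works under that assumption. The function name, the checksum field offset (152) and the 4096-byte buffer are all assumptions for illustration and are not taken from the thread, and the buffer is synthetic rather than real disk data.

```python
import struct

def word_sum_checksum(sb: bytes, csum_offset: int, byteorder: str = ">") -> int:
    """Sum all 32-bit words with the checksum word zeroed, then fold to 32 bits."""
    n = len(sb) // 4
    words = list(struct.unpack(f"{byteorder}{n}I", sb[: n * 4]))
    words[csum_offset // 4] = 0          # the stored checksum does not count itself
    total = sum(words)
    return ((total & 0xFFFFFFFF) + (total >> 32)) & 0xFFFFFFFF

CSUM_OFFSET = 152  # assumed byte offset of the checksum field; verify before use

# Synthetic superblock: only the magic and the minor version are filled in.
buf = bytearray(4096)
struct.pack_into(">I", buf, 0, 0xA92B4EFC)   # magic
struct.pack_into(">I", buf, 8, 0x5B)         # minor version 91, as found on HDD2
before = word_sum_checksum(bytes(buf), CSUM_OFFSET)

struct.pack_into(">I", buf, 8, 0x5A)         # patched back to 90
after = word_sum_checksum(bytes(buf), CSUM_OFFSET)

print(f"before: {before:08x}  after: {after:08x}  difference: {before - after}")
```

With a checksum of this kind, every header word contributes directly to the sum, so putting the version and UUID words back to their original values is enough to make the computed checksum agree with the stored one again, which is consistent with what was observed after the hexedit fix.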
