Re: isw device for volume broken after opensuse livecd boot

Thanks, that's what I originally thought.

I have now confirmed that this is a bug. It involves the openSUSE 10.3
RC2 live CD (dmraid or device-mapper) and the OROM.

I deleted the RAID0 volume, disconnected the third drive that wasn't
part of the RAID, and chose one of the Hitachis for the RAID1 rebuild.
Then I created a new RAID0 volume, so the RAID0 was online and the
RAID1 was degraded.

I then booted the openSUSE live CD, opened a root terminal and typed
"dmraid -ay". It reported that the RAID0 was broken.

I rebooted and the OROM said that the second RAID member was offline,
so the RAID0 was failed and the RAID1 degraded.
So just typing "dmraid -ay" makes the OROM think one or more members
are offline! I think only one went offline this time because the RAID1
was already degraded. The first time, when both were online, both went
offline after "dmraid -ay".

Then I disconnected the SATA cable of one of the disks, and the OROM
said that both the RAID0 and the RAID1 had failed. Finally I
reconnected all disks; both were recognized and the RAID0 came online
again.

So after all it was not bad luck or bad connectors! This is a big bug
in dmraid or device-mapper, and in the OROM as well.

First of all, the metadata should not record a member as offline;
"offline" should only describe a temporarily disconnected member.

You can easily reproduce this by taking the following steps:

Get a P35 motherboard with ICH9R.
Create a RAID0 volume with two disks (in this case two Hitachi 7K160).
Create a RAID1 volume.
Boot the openSUSE 10.3 RC2 live CD. Open a terminal, type "su" and
then "dmraid -ay".
Reboot and see that at least one of the disks is shown as an offline
member, and the RAID1 fails.
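
If someone wants to reproduce this, it may help to record what dmraid
itself reports before and after the "dmraid -ay" step. Below is a
minimal Python sketch of that idea, not something I ran on the affected
machine. It assumes dmraid is installed and that it is run as root; the
helper names (run_command, snapshot, changed) are made up for this
example. It only uses the read-only queries "dmraid -r" and "dmraid -s".

```python
import subprocess

# Read-only dmraid queries: "-r" lists the RAID member disks it finds,
# "-s" shows the status of the RAID sets. Neither activates anything.
QUERY_COMMANDS = (["dmraid", "-r"], ["dmraid", "-s"])


def run_command(argv):
    """Run one command and return its combined stdout/stderr as text."""
    done = subprocess.run(
        argv, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True
    )
    return done.stdout


def snapshot(run=run_command):
    """Capture the output of each query, keyed by its command line."""
    return {" ".join(argv): run(argv) for argv in QUERY_COMMANDS}


def changed(before, after):
    """Return the command lines whose output differs between snapshots."""
    return sorted(cmd for cmd in before if before[cmd] != after.get(cmd))
```

The intended use is: call snapshot() once, run "dmraid -ay" by hand,
call snapshot() again, and pass both results to changed(). Note that
this only compares dmraid's own view; what the OROM shows still has to
be checked after a reboot.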


On 9/28/07, Fang, Ying <ying.fang@xxxxxxxxx> wrote:
> Sorry, Tiago. I misread your email regarding the two volumes: RAID0 and
> RAID0:1.
>
> In the following messages:
>
> "Port 1 .. Member disk(0,1)" means that the hard drive attached to port
> 0 (scsi address: 1:0:0:0) is a member of two RAID arrays (RAID0 and
> RAID1) which are defined as RAID id 0 and 1 respectively.
>
> But port 0(scsi address 0:0:0:0) has a hard drive that has a RAID
> configuration including the same names of the RAID arrays.
>
> Because the metadata got messed up, OROM couldn't determine that the
> above two hard drives belonged to the same group of disks. In order
> to differentiate the names of the RAID volumes from two hard drives, :1
> was added in.
>
> >>> >0  RAID0:1   80Gb       Failed
> >>> >1  RAID1:1  109.0Gb   Degraded
> >>> >2  RAID0     80Gb        Failed
> >>> >3  RAID1     109.0Gb   Degraded
> >>> >
> >>> >Port
> >>> >0     Hitachi        149.1GB     Member Disk(0,1)
> >>> >1     Hitachi        149.1GB     Member Disk(2,3)
>
> Thanks Eric for pointing out that the OROM display screen doesn't
> include the partition information.
>
> I hope that will help you understand those magic numbers. If you have
> any questions, let me know.
>
> Ying
> >-----Original Message-----
> >From: Fang, Ying
> >Sent: Thursday, September 27, 2007 4:07 PM
> >To: Tiago Freitas
> >Cc: ATARAID (eg, Promise Fasttrak, Highpoint 370) related discussions
> >Subject: RE: isw device for volume broken after opensuse livecd boot
> >
> >Are you talking about RAID1 and RAID1:1? The first is the RAID device
> >and the latter is the first partition in that RAID device. If you have
> >more than one partition there, you'll get RAID1:2 and so on.
> >
> >Ying
>
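
To make the "Member Disk(...)" numbers in the quoted OROM screen
concrete, here is a small Python sketch that parses that screen text
and lists the volumes each port belongs to. It assumes, following
Ying's explanation, that the numbers in parentheses are the IDs from
the first column of the volume table. The function names and the
regular expressions are my own and only match the layout quoted in this
thread.

```python
import re

# OROM status text as quoted in this thread, mail quote markers removed.
OROM_SCREEN = """\
0  RAID0:1   80Gb       Failed
1  RAID1:1  109.0Gb   Degraded
2  RAID0     80Gb        Failed
3  RAID1     109.0Gb   Degraded

Port
0     Hitachi        149.1GB     Member Disk(0,1)
1     Hitachi        149.1GB     Member Disk(2,3)
"""

# Volume row: id, name, size, status.
VOLUME_RE = re.compile(r"^(\d+)\s+(\S+)\s+(\S+)\s+(\w+)$")
# Disk row: port, model, size, then the list of volume ids.
DISK_RE = re.compile(r"^(\d+)\s+(\S+)\s+(\S+)\s+Member Disk\(([\d,\s]+)\)$")


def parse_orom_screen(text):
    """Return (volumes, disks) parsed from the two tables."""
    volumes = {}  # volume id -> (name, size, status)
    disks = {}    # port number -> list of volume ids
    for line in text.splitlines():
        line = line.strip()
        disk = DISK_RE.match(line)
        if disk:
            ids = [int(i) for i in disk.group(4).split(",")]
            disks[int(disk.group(1))] = ids
            continue
        vol = VOLUME_RE.match(line)
        if vol:
            volumes[int(vol.group(1))] = vol.group(2, 3, 4)
    return volumes, disks


def volumes_by_port(volumes, disks):
    """Map each port to the names of the volumes its disk belongs to."""
    return {port: [volumes[i][0] for i in ids] for port, ids in disks.items()}
```

With the screen above, the disk on port 0 maps to RAID0:1 and RAID1:1
and the disk on port 1 maps to RAID0 and RAID1, which shows the two
disks being treated as two separate groups.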

_______________________________________________
Ataraid-list mailing list
Ataraid-list@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/ataraid-list
