Re: RAID5 array showing as degraded after motherboard replacement

On 06/11/06, dean gaudet <dean@xxxxxxxxxx> wrote:


On Mon, 6 Nov 2006, James Lee wrote:

> Thanks for the reply Dean.  I looked through dmesg output from the
> boot up, to check whether this was just an ordering issue during the
> system start up (since both evms and mdadm attempt to activate the
> array, which could cause things to go wrong...).
>
> Looking through the dmesg output though, it looks like the 'missing'
> disk is being detected before the array is assembled, but that the
> disk is throwing up errors.  I've attached the full output of dmesg;
> grepping it for "hde" gives the following:
>
> [17179574.084000]     ide2: BM-DMA at 0xd400-0xd407, BIOS settings: hde:DMA, hdf:DMA
> [17179574.380000] hde: NetCell SyncRAID(TM) SR5000 JBOD, ATA DISK drive
> [17179575.312000] hde: max request size: 512KiB
> [17179575.312000] hde: 625134827 sectors (320069 MB), CHS=38912/255/63, (U)DMA
> [17179575.312000] hde: set_geometry_intr: status=0x51 { DriveReady SeekComplete Error }
> [17179575.312000] hde: set_geometry_intr: error=0x04 { DriveStatusError }
> [17179575.312000] hde: cache flushes supported

is it possible that the "NetCell SyncRAID" implementation is stealing some
of the sectors (even though it's marked JBOD)?  anyhow it could be the
disk is bad, but i'd still be tempted to see if the problem stays with the
controller if you swap the disk with another in the array.

-dean


Looks like you might be right.  I removed one of the other drives from
the onboard controller, and moved the 'faulty' drive from the NetCell
controller to the onboard one.  Booted up the machine, and the
drive is still not added to the array correctly (so the array now
fails to assemble, as there are only 3 out of 5 drives).  I've run the
Seagate diagnostic tools over the drive: they report success
when it's connected to the onboard controller and failure when
it's connected to the NetCell controller (this may be a test tool
issue though).

I guess this indicates that either:
1) The NetCell controller is faulty and just not reading/writing data properly.
2) The NetCell controller's RAID implementation has somehow not been
transparent to the OS and has overwritten/modified md's superblocks.
3) EVMS somehow messed the config up on that drive when trying to
reassemble the array after the first time the controller came up.

I'll test for 1) by attaching another drive (not one of the ones in
the array!) to the NetCell controller and seeing if it passes the
diagnostic tests.  3) seems pretty unlikely.
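
For test 1), it would also be worth checking whether the same
set_geometry_intr errors show up in dmesg for the replacement drive.
A rough sketch of that check in Python, assuming the dmesg lines keep
the format quoted above; ide_error_lines is a made-up helper name for
illustration, not part of any existing tool, and the sample text is
copied from the quoted dmesg output (wrapped lines rejoined):

```python
import re


def ide_error_lines(dmesg_text, device):
    """Return the dmesg lines for `device` that carry a status= or error= code.

    `ide_error_lines` is a hypothetical helper for illustration only.
    It assumes lines shaped like the ones quoted in this thread, e.g.
    "[...] hde: set_geometry_intr: error=0x04 { DriveStatusError }".
    """
    pattern = re.compile(
        r"\b" + re.escape(device) + r": \S+: (?:status|error)=0x[0-9a-fA-F]+"
    )
    return [line for line in dmesg_text.splitlines() if pattern.search(line)]


# Sample lines taken from the dmesg output quoted earlier in the thread.
SAMPLE = """\
[17179574.380000] hde: NetCell SyncRAID(TM) SR5000 JBOD, ATA DISK drive
[17179575.312000] hde: max request size: 512KiB
[17179575.312000] hde: set_geometry_intr: status=0x51 { DriveReady SeekComplete Error }
[17179575.312000] hde: set_geometry_intr: error=0x04 { DriveStatusError }
[17179575.312000] hde: cache flushes supported
"""

if __name__ == "__main__":
    for line in ide_error_lines(SAMPLE, "hde"):
        print(line)
```

On a live system the same function could be fed the real dmesg output
instead of SAMPLE.  If the errors appear for any drive on the NetCell
card but never on the onboard controller, that points at 1) or 2)
rather than at the disk itself.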

I bought the NetCell card mainly for its Linux compatibility - do they
have known issues with mdadm?

Thanks,
James