On 06/11/06, dean gaudet <dean@xxxxxxxxxx> wrote:
On Mon, 6 Nov 2006, James Lee wrote:

> Thanks for the reply Dean. I looked through dmesg output from the
> boot up, to check whether this was just an ordering issue during the
> system start up (since both evms and mdadm attempt to activate the
> array, which could cause things to go wrong...).
>
> Looking through the dmesg output though, it looks like the 'missing'
> disk is being detected before the array is assembled, but that the
> disk is throwing up errors. I've attached the full output of dmesg;
> grepping it for "hde" gives the following:
>
> [17179574.084000] ide2: BM-DMA at 0xd400-0xd407, BIOS settings: hde:DMA, hdf:DMA
> [17179574.380000] hde: NetCell SyncRAID(TM) SR5000 JBOD, ATA DISK drive
> [17179575.312000] hde: max request size: 512KiB
> [17179575.312000] hde: 625134827 sectors (320069 MB), CHS=38912/255/63, (U)DMA
> [17179575.312000] hde: set_geometry_intr: status=0x51 { DriveReady SeekComplete Error }
> [17179575.312000] hde: set_geometry_intr: error=0x04 { DriveStatusError }
> [17179575.312000] hde: cache flushes supported

is it possible that the "NetCell SyncRAID" implementation is stealing
some of the sectors (even though it's marked JBOD)?

anyhow it could be the disk is bad, but i'd still be tempted to see if
the problem stays with the controller if you swap the disk with another
in the array.

-dean
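The "stealing sectors" idea can be sanity-checked with some arithmetic on the sector count in the dmesg output above. The Python sketch below rests on assumptions that are not stated in the thread: that the drive follows the IDEMA LBA-count convention for a nominal 320 GB capacity, that sectors are 512 bytes, and that the array uses md's version 0.90 superblock, which sits in the last 64 KiB-aligned 64 KiB block of the member device. The function names are illustrative only.

```python
SECTOR_BYTES = 512          # assumed logical sector size
MD_RESERVED_SECTORS = 128   # 64 KiB reserved for a 0.90 superblock, in 512-byte sectors


def idema_lba_count(capacity_gb):
    """Sector count a drive of this nominal capacity reports under the IDEMA convention."""
    return 97_696_368 + 1_953_504 * (capacity_gb - 50)


def md090_superblock_sector(device_sectors):
    """Start sector of a version 0.90 md superblock on a device of the given size."""
    aligned = device_sectors & ~(MD_RESERVED_SECTORS - 1)  # round down to a 64 KiB boundary
    return aligned - MD_RESERVED_SECTORS


reported = 625_134_827           # sector count from the hde line in dmesg
nominal = idema_lba_count(320)   # assumption: what the bare drive would report

print("reported size (MB):", reported * SECTOR_BYTES // 10**6)
print("sectors hidden by the controller (if the assumption holds):", nominal - reported)
print("0.90 superblock expected at sector:", md090_superblock_sector(reported))
print("0.90 superblock on the full-size device:", md090_superblock_sector(nominal))
```

If the assumptions hold, the two superblock offsets differ, so a superblock written while the disk was seen at one size would not be found where md looks when the disk is seen at the other size. If the array member is a partition rather than the whole disk, the offset is relative to the partition and this comparison does not apply directly.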
Looks like you might be right. I removed one of the other drives from the onboard controller, and moved the 'faulty' drive from the NetCell controller to the onboard one. I booted up the machine, and the drive is still not added to the array correctly (so the array now fails to assemble, as only 3 of the 5 drives are available).

I've run the Seagate diagnostic tools over the drive: they report success when it's connected to the onboard controller and failure when it's connected to the NetCell controller (this may be a test tool issue though).

I guess this indicates one of the following:

1) The NetCell controller is faulty and just not reading/writing data properly.
2) The NetCell controller's RAID implementation has somehow not been transparent to the OS and has overwritten/modified md's superblocks.
3) EVMS somehow messed the config up on that drive when trying to reassemble the array after the first time the controller came up.

I'll test for 1) by attaching another drive (not one of the ones in the array!) to the NetCell controller and seeing if it passes the diagnostic tests. 3) seems pretty unlikely.

I bought the NetCell card mainly for its Linux compatibility - do they have known issues with mdadm?

Thanks,
James

-
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html