Drive failure during SMART test

Help very much wanted!

My setup:
Thecus N5550 NAS with five 1 TB drives installed.

md0:  RAID 5 across 4 drives (sd[abcd]2)
md10: RAID 1 across all 5 drives (sd[abcde]1), system-generated array
md50: RAID 1 across 4 drives (sd[abcd]3), system-generated array

One drive (sde) is set as a global hot spare.
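
For reference, this is how I read the layout back (a minimal sketch;
the array and device names are the ones from my setup listed above):

    cat /proc/mdstat            # all arrays, their members and sync state
    mdadm --detail /dev/md0     # one array in detail: level, members, spares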


What happened:
This weekend I thought it might be a good idea to run a SMART test on
the drives in my NAS.
I started the test on one drive, and after it had run for a while I
started the others.
While the tests were running, drive 3 failed. I got a message that the
RAID was degraded and had started rebuilding. (My assumption is that at
this moment the global hot spare is automatically added to the array.)
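
If anyone wants to check that assumption with me: I believe something
like this would have shown whether the spare was pulled in (sde2 as the
spare member of md0 is my guess, based on the partition layout above):

    mdadm --detail /dev/md0     # look for sde2 listed as "spare rebuilding"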

I stopped the SMART tests on all drives at that moment, since it seemed
logical to me that the SMART test (or its outcome) had made the drive
fail. While stopping the tests, drive 1 failed as well!
I left it alone for a little while, but the admin interface kept telling
me the array was degraded, and it did not seem to take any action to
start rebuilding.
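
I stopped the tests through the admin interface; my assumption is that
it does the equivalent of smartctl's abort option under the hood:

    smartctl -X /dev/sda        # abort the self-test running on a drive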
At this point I started googling and found that I should remove and
reseat the drives. That is what I did, but nothing seemed to happen.
They turned up as new drives in the admin interface and I re-added them
to the array, where they were added as spares.
Even after adding them, the array didn't start rebuilding.
I checked the state in mdadm and it told me "clean, FAILED", as opposed
to the "degraded" shown in the admin interface.
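
In mdadm terms, I believe the re-adding amounts to something like this
(the partition numbers are assumed from the layout above):

    mdadm --detail /dev/md0            # State : clean, FAILED
    mdadm /dev/md0 --add /dev/sda2     # re-added drives came back as spares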

I rebooted the NAS, since it didn't seem to be doing anything I might
interrupt. After rebooting, it seemed as if the entire array had
disappeared!
I started looking for options in mdadm and tried every "normal" option
to rebuild the array (--assemble --scan, for example).
Unfortunately I cannot produce a complete list, since I cannot find how
to retrieve it from the logs.
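
These are the ones I am fairly sure I tried, reconstructed from memory,
so treat the exact invocations as approximate:

    mdadm --stop /dev/md0
    mdadm --assemble --scan --verbose    # assemble from scanned superblocks
    mdadm --examine /dev/sd[abcd]2       # dump what is left of the superblocks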

Finally, I ran mdadm --create to make a new array from the original
4 drives with all the right settings (I got them from one of the
original volumes).
The creation worked, but the resulting array doesn't seem to have a
valid partition table. This is the point where I realized I had
probably fucked it up big-time and should call in the help squad!
What I think went wrong: I re-created the array with the original
4 drives from before the first failure, but the hot spare had already
been added?
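
For completeness, the create command was along these lines. The chunk
size and metadata version shown here are placeholders (I used the
values copied from one of the original volumes), and the device order
is exactly where I suspect my mistake lies:

    mdadm --create /dev/md0 --level=5 --raid-devices=4 \
          --chunk=64 --metadata=0.90 /dev/sd[abcd]2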

Luckily, the most important data from the array is saved in an offline
backup, but I would still very much like to restore the data from the
array if there is any way to do so.

Is there any way I could get it back online?