On Wed Oct 09, 2013 at 10:54:09AM +0200, Guillaume Betous wrote:

> I don't know if /dev/sdb is still usable, or if this was only a
> desynchro failure.
> How to know ?
>
As sdb1 has already been marked spare, it'll need rebuilding anyway, so
it doesn't really matter. If there's a real issue with it then it'll
fail during the recovery process anyway. You can do a full read test on
it (either a long SMART test, a simple dd from it, or a read-only
badblocks test) if you want to check for issues though.

> P.S. The /proc/mdstat file currently contains the following:
>
> Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4]
> md127 : active raid5 sde1[2] sdb1[5](F) sdc1[0](F) sdd1[6] sdf1[4]
>       5860535808 blocks super 1.2 level 5, 512k chunk, algorithm 2 [4/2] [__UU]
>       [============>........]  recovery = 62.0%
> (1212358192/1953511936) finish=854.7min speed=14451K/sec
>
It looks like recovery was kicked off onto sde1, but sdc has failed
again during the rebuild. This would suggest a read error on sdc1
somewhere - dmesg should show some indication of what's happened.

You'll need to stop the array and sort out sdc before you can get it
going again. Use GNU ddrescue to image it onto another disk (preferably
one that wasn't originally a member of the array) - it may be able to
get all the data read (it tries somewhat harder than normal processes),
or you'll at least see how much is unreadable.

If it's all read okay then you can just re-run the force assembly using
that disk instead of sdc (make sure you explicitly list the devices to
use in the assembly command). Then add one of the other disks and wait
for the rebuild to complete (there may be no real issue with sdc - you
do sometimes get read errors on disks which are solved by simply
rewriting the data).
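
Going back to the mdstat output quoted above: if you'd rather pick out
the failed members from a script than by eye, something like this rough
Python sketch should do. It assumes the usual "name[n](F)" layout seen
in that output, and parse_members is just a name made up for
illustration, not part of mdadm or any library.

```python
import re

# Member entries on an mdstat device line look like "sdc1[0](F)":
# device name, a number in brackets, then optional flags such as (F).
MEMBER_RE = re.compile(r"(\w+)\[(\d+)\]((?:\([A-Z]\))*)")


def parse_members(line):
    """Return (array_name, {device: set_of_flags}) for one mdstat device line."""
    name, _, rest = line.partition(" : ")
    members = {}
    for dev, _num, flags in MEMBER_RE.findall(rest):
        # Collect the single-letter flags, e.g. "(F)" -> {"F"}.
        members[dev] = set(re.findall(r"\(([A-Z])\)", flags))
    return name.strip(), members


# The device line from the /proc/mdstat output quoted earlier in this mail.
line = "md127 : active raid5 sde1[2] sdb1[5](F) sdc1[0](F) sdd1[6] sdf1[4]"
array, members = parse_members(line)
failed = sorted(dev for dev, flags in members.items() if "F" in flags)
print(array, "failed members:", failed)
```

This only reads the device line; it doesn't tell you why a member was
failed, so dmesg is still the place to look for the actual read error.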

If ddrescue can't read everything then you have to make a decision
about whether there are few enough unreadable blocks to continue with
assembly (as above) and possibly end up with some corrupt files, or
whether you want to risk re-creating the array using the other original
member (I'd suggest doing a full read test on that disk first though, as
it may be in the same state). If you're wanting to do a re-create then
we'll need to revisit your original array details to see what parameters
would be needed (and which mdadm version you'll need to get the correct
data offsets).

Once everything's back up and running, you really need to:
    - make sure the timeouts/ERC are set correctly at every boot
    - schedule array checks on a regular basis to pick up any read
      errors while they can still be corrected

Cheers,
    Robin
-- 
     ___
    ( ' }     |       Robin Hill        <robin@xxxxxxxxxxxxxxx> |
   / / )      | Little Jim says ....                            |
  // !!       |      "He fallen in de water !!"                 |