On Thu, 14 Aug 2014 18:08:30 -0500 Ram Ramesh <rramesh2400@xxxxxxxxx> wrote:

> Hi,
>
> I just finished converting a 3-disk raid5 to 4-disk raid6. After a
> reboot to start clean, I noticed that one of the disks (the new one I
> just added) was missing in /proc/partitions. This was disk 4 in my
> /dev/md0. Assuming some cable issue, I powered off, wiggled the cables
> and restarted and the device was found by the kernel. However, md0 shows
> device missing and array degraded
>
> lata [rramesh] 280 > cat /proc/mdstat
> Personalities : [raid6] [raid5] [raid4]
> md0 : active raid6 sdb1[0] sdd1[3] sdc1[1]
>       3906763776 blocks super 1.2 level 6, 512k chunk, algorithm 2 [4/3] [UUU_]
>
> unused devices: <none>
>
> However my attempt to --re-add does not work.
>
> lata [rramesh] 277 > sudo mdadm /dev/md0 --verbose --re-add /dev/sde1
> mdadm: --re-add for /dev/sde1 to /dev/md0 is not possible

"re-add" only makes sense when you have a write-intent bitmap, which you
don't have.
So you need to "--add", which marks the device as a spare and then starts a
complete rebuild.

> I checked the SMART and it shows a lot of reallocated_sector_ct errors
> also. So, the disk is dying, but I am not able to understand why mdadm
> would not add.

It will "add". It just won't "re-add".
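
As an illustration of the point above, here is a minimal Python sketch that reads /proc/mdstat-style text, finds degraded arrays, and picks between "--re-add" and "--add" by the rule stated in this reply (re-add only when the array has a write-intent bitmap). The names parse_mdstat, suggest_flag and SAMPLE are hypothetical, and the sketch assumes that an array with a bitmap shows a "bitmap:" line in its mdstat stanza. It only inspects text; it does not run mdadm.

```python
import re

# The mdstat output quoted in this thread.
SAMPLE = """\
Personalities : [raid6] [raid5] [raid4]
md0 : active raid6 sdb1[0] sdd1[3] sdc1[1]
      3906763776 blocks super 1.2 level 6, 512k chunk, algorithm 2 [4/3] [UUU_]

unused devices: <none>
"""


def parse_mdstat(text):
    """Return {array: {"slots": int, "active": int, "bitmap": bool}}."""
    arrays = {}
    current = None
    for line in text.splitlines():
        header = re.match(r"(md\w+)\s*:", line)
        if header:
            current = arrays.setdefault(
                header.group(1),
                {"slots": None, "active": None, "bitmap": False},
            )
            continue
        if current is None:
            continue
        if not line.strip():
            # A blank line ends the stanza for the current array.
            current = None
            continue
        # "[4/3]" means 4 slots in the array, 3 of them active.
        counts = re.search(r"\[(\d+)/(\d+)\]", line)
        if counts:
            current["slots"] = int(counts.group(1))
            current["active"] = int(counts.group(2))
        # Assumption: a write-intent bitmap shows up as a "bitmap:" line.
        if line.strip().startswith("bitmap:"):
            current["bitmap"] = True
    return arrays


def suggest_flag(info):
    """Return None for a healthy array, else "--re-add" or "--add"."""
    if info["slots"] is None or info["active"] >= info["slots"]:
        return None
    return "--re-add" if info["bitmap"] else "--add"


if __name__ == "__main__":
    for name, info in parse_mdstat(SAMPLE).items():
        print(name, info, suggest_flag(info))
```

For the array in this thread the sketch settles on "--add", since the stanza for md0 has no bitmap line.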

NeilBrown

>
> SMART Attributes Data Structure revision number: 16
> Vendor Specific SMART Attributes with Thresholds:
> ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
>   1 Raw_Read_Error_Rate     0x000b   091   091   016    Pre-fail  Always       -       53
>   2 Throughput_Performance  0x0005   100   100   054    Pre-fail  Offline      -       0
>   3 Spin_Up_Time            0x0007   135   135   024    Pre-fail  Always       -       426 (Average 425)
>   4 Start_Stop_Count        0x0012   100   100   000    Old_age   Always       -       59
>   5 Reallocated_Sector_Ct   0x0033   001   001   005    Pre-fail  Always   FAILING_NOW 330
>   7 Seek_Error_Rate         0x000b   098   098   067    Pre-fail  Always       -       2
>   8 Seek_Time_Performance   0x0005   100   100   020    Pre-fail  Offline      -       0
>   9 Power_On_Hours          0x0012   100   100   000    Old_age   Always       -       3445
>  10 Spin_Retry_Count        0x0013   100   100   060    Pre-fail  Always       -       0
>  12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       59
> 192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       548
> 193 Load_Cycle_Count        0x0012   100   100   000    Old_age   Always       -       548
> 194 Temperature_Celsius     0x0002   153   153   000    Old_age   Always       -       39 (Min/Max 21/43)
> 196 Reallocated_Event_Count 0x0032   001   001   000    Old_age   Always       -       17604
> 197 Current_Pending_Sector  0x0022   001   001   000    Old_age   Always       -       13256
> 198 Offline_Uncorrectable   0x0008   100   100   000    Old_age   Offline      -       0
> 199 UDMA_CRC_Error_Count    0x000a   200   200   000    Old_age   Always       -       0
>
> Any recommendations while I am waiting to get a replacement.
>
> Ramesh
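
To make the quoted SMART table easier to read at a glance, here is a small Python sketch that scans attribute rows in this column layout and reports those whose normalized VALUE is at or below a non-zero THRESH, which is my understanding of when smartctl considers an attribute failed. The function name failing_attributes and the ROWS sample are hypothetical; the three sample rows are copied from the table above.

```python
# Three rows copied from the SMART table quoted in this thread.
ROWS = """\
  1 Raw_Read_Error_Rate     0x000b   091   091   016    Pre-fail  Always       -       53
  5 Reallocated_Sector_Ct   0x0033   001   001   005    Pre-fail  Always   FAILING_NOW 330
197 Current_Pending_Sector  0x0022   001   001   000    Old_age   Always       -       13256
"""


def failing_attributes(table_text):
    """Return (name, value, thresh, raw) for rows with VALUE <= THRESH.

    Rows with THRESH == 0 are skipped, since they can never cross
    the threshold.
    """
    failing = []
    for line in table_text.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have 10+ columns:
        # ID NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW...
        if len(fields) < 10 or not fields[0].isdigit():
            continue
        value, thresh = int(fields[3]), int(fields[5])
        if thresh > 0 and value <= thresh:
            failing.append((fields[1], value, thresh, fields[9]))
    return failing


if __name__ == "__main__":
    for name, value, thresh, raw in failing_attributes(ROWS):
        print(f"{name}: value {value} <= threshold {thresh}, raw {raw}")
```

Only Reallocated_Sector_Ct crosses its threshold here. Current_Pending_Sector has a threshold of 000, so this check does not flag it, even though its raw count of 13256 is itself a bad sign.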