On Sat Sep 15, 2012 at 09:05:25 +0200, Niccolò Belli wrote:

> CHECK didn't help me, so I did a echo "repair >
> /sys/block/md0/md/sync_action". REPAIR didn't work too :(
>
Didn't work for what you were wanting anyway. It may well have worked
for its intended purpose.

> Here is syslog of REPAIR:
>
> Sep 15 19:34:10 asterisk mdadm[2117]: RebuildStarted event detected on
> md device /dev/md/0
> Sep 15 19:34:10 asterisk kernel: [258470.152296] md: requested-resync of
> RAID array md0
> Sep 15 19:34:10 asterisk kernel: [258470.152301] md: minimum
> _guaranteed_ speed: 1000 KB/sec/disk.
> Sep 15 19:34:10 asterisk kernel: [258470.152304] md: using maximum
> available idle IO bandwidth (but not more than 200000 KB/sec) for
> requested-resync.
> Sep 15 19:34:10 asterisk kernel: [258470.152310] md: using 128k window,
> over a total of 311619448k.
> Sep 15 19:34:11 asterisk kernel: [258471.165653] ata3.00: exception
> Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
> Sep 15 19:34:11 asterisk kernel: [258471.167468] ata3.00: BMDMA stat 0x44
> Sep 15 19:34:11 asterisk kernel: [258471.169912] ata3.00: failed
> command: READ DMA EXT
> Sep 15 19:34:11 asterisk kernel: [258471.172769] ata3.00: cmd
> 25/00:00:00:15:00/00:04:00:00:00/e0 tag 0 dma 524288 in
> Sep 15 19:34:11 asterisk kernel: [258471.172771] res
> 51/40:00:90:17:00/40:00:00:00:00/e0 Emask 0x9 (media error)
> Sep 15 19:34:11 asterisk kernel: [258471.176753] ata3.00: status: { DRDY
> ERR }
> Sep 15 19:34:11 asterisk kernel: [258471.178605] ata3.00: error: { UNC }
> Sep 15 19:34:12 asterisk kernel: [258472.148217] ata3.00: configured for
> UDMA/133
> Sep 15 19:34:12 asterisk kernel: [258472.148232] ata3: EH complete
> Sep 15 19:34:13 asterisk kernel: [258473.131054] ata3.00: exception
> Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
> Sep 15 19:34:13 asterisk kernel: [258473.132881] ata3.00: BMDMA stat 0x44
> Sep 15 19:34:13 asterisk kernel: [258473.134639] ata3.00: failed
> command: READ DMA EXT
> Sep 15 19:34:13 asterisk kernel: [258473.136413] ata3.00: cmd
> 25/00:00:00:15:00/00:04:00:00:00/e0 tag 0 dma 524288 in
> Sep 15 19:34:13 asterisk kernel: [258473.136415] res
> 51/40:00:90:17:00/40:00:00:00:00/e0 Emask 0x9 (media error)
> Sep 15 19:34:13 asterisk kernel: [258473.141768] ata3.00: status: { DRDY
> ERR }
> Sep 15 19:34:13 asterisk kernel: [258473.144049] ata3.00: error: { UNC }
> Sep 15 19:34:14 asterisk kernel: [258474.112209] ata3.00: configured for
> UDMA/133
> Sep 15 19:34:14 asterisk kernel: [258474.112224] ata3: EH complete
> Sep 15 19:34:15 asterisk kernel: [258475.071642] ata3.00: exception
> Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
> Sep 15 19:34:15 asterisk kernel: [258475.073476] ata3.00: BMDMA stat 0x44
> Sep 15 19:34:15 asterisk kernel: [258475.075240] ata3.00: failed
> command: READ DMA EXT
> Sep 15 19:34:15 asterisk kernel: [258475.077027] ata3.00: cmd
> 25/00:00:00:15:00/00:04:00:00:00/e0 tag 0 dma 524288 in
> Sep 15 19:34:15 asterisk kernel: [258475.077029] res
> 51/40:00:90:17:00/40:00:00:00:00/e0 Emask 0x9 (media error)
> Sep 15 19:34:15 asterisk kernel: [258475.080720] ata3.00: status: { DRDY
> ERR }
> Sep 15 19:34:15 asterisk kernel: [258475.083512] ata3.00: error: { UNC }
> Sep 15 19:34:16 asterisk kernel: [258476.100935] ata3.00: configured for
> UDMA/133
> Sep 15 19:34:16 asterisk kernel: [258476.100960] ata3: EH complete
> Sep 15 19:41:29 asterisk asterisk[3492]: rc_avpair_new: unknown
> attribute 1490026597
> Sep 15 19:41:46 asterisk asterisk[3492]: rc_avpair_new: unknown
> attribute 1490026597
> Sep 15 19:41:52 asterisk asterisk[3492]: rc_avpair_new: unknown
> attribute 1490026597
> Sep 15 19:42:52 asterisk asterisk[3492]: rc_avpair_new: unknown
> attribute 1490026597
> Sep 15 19:46:34 asterisk smartd[2581]: Device: /dev/sda [SAT], 2
> Currently unreadable (pending) sectors
> Sep 15 19:46:34 asterisk smartd[2581]: Device: /dev/sda [SAT], 1 Offline
> uncorrectable sectors
> Sep 15 19:50:51 asterisk mdadm[2117]: Rebuild26 event detected on md
> device /dev/md/0
> Sep 15 20:07:31 asterisk mdadm[2117]: Rebuild53 event detected on md
> device /dev/md/0
> Sep 15 20:16:34 asterisk smartd[2581]: Device: /dev/sda [SAT], 2
> Currently unreadable (pending) sectors
> Sep 15 20:16:34 asterisk smartd[2581]: Device: /dev/sda [SAT], 1 Offline
> uncorrectable sectors
> Sep 15 20:16:34 asterisk smartd[2581]: Device: /dev/sda [SAT],
> Temperature changed +4 Celsius to 42 Celsius (Min/Max 30/46)
> Sep 15 20:16:34 asterisk smartd[2581]: Device: /dev/sda [SAT], SMART
> Usage Attribute: 201 Soft_Read_Error_Rate changed from 99 to 100
> Sep 15 20:16:34 asterisk smartd[2581]: Device: /dev/sdb [SAT], SMART
> Usage Attribute: 190 Airflow_Temperature_Cel changed from 61 to 60
> Sep 15 20:24:11 asterisk mdadm[2117]: Rebuild75 event detected on md
> device /dev/md/0
> Sep 15 20:40:51 asterisk mdadm[2117]: Rebuild93 event detected on md
> device /dev/md/0
> Sep 15 20:46:34 asterisk smartd[2581]: Device: /dev/sda [SAT], 2
> Currently unreadable (pending) sectors
> Sep 15 20:46:34 asterisk smartd[2581]: Device: /dev/sda [SAT], 1 Offline
> uncorrectable sectors
> Sep 15 20:46:34 asterisk smartd[2581]: Device: /dev/sda [SAT], SMART
> Usage Attribute: 190 Airflow_Temperature_Cel changed from 61 to 60
> Sep 15 20:47:24 asterisk kernel: [262863.781068] md: md0:
> requested-resync done.
> Sep 15 20:47:24 asterisk mdadm[2117]: RebuildFinished event detected on
> md device /dev/md/0
>
>
Okay, so the drive logs an exception at 19:34:11, then completes its
error handling at 19:34:16. If md hasn't failed the drive then either:
    - md didn't get a read error
    - md got a success message when re-writing the block
    - there's a bug in md and it's not handled the error at all

My guess would be on one of the first two (I'm not sure what's logged if
md gets a read error and does a re-write).
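To see which sector the drive is actually complaining about, the "cmd" and
"res" taskfile dumps in the log can be decoded by hand. The sketch below is
only an illustration: it assumes libata's usual field order
(command/feature:nsect:lbal:lbam:lbah/hob_feature:hob_nsect:hob_lbal:hob_lbam:hob_lbah/device)
and a 48-bit command such as READ DMA EXT. The function names are made up for
this example and are not part of any tool.

```python
# Assumed libata field order for the "cmd"/"res" dumps (not taken from the post).
FIELDS_LOW = ("feature", "nsect", "lbal", "lbam", "lbah")
FIELDS_HOB = ("hob_feature", "hob_nsect", "hob_lbal", "hob_lbam", "hob_lbah")


def parse_taskfile(text):
    """Split a taskfile dump like '25/00:00:00:15:00/00:04:00:00:00/e0' into named bytes."""
    command, low, hob, device = text.strip().split("/")
    fields = {"command": int(command, 16), "device": int(device, 16)}
    for name, value in zip(FIELDS_LOW, low.split(":")):
        fields[name] = int(value, 16)
    for name, value in zip(FIELDS_HOB, hob.split(":")):
        fields[name] = int(value, 16)
    return fields


def lba48(tf):
    """Combine the six LBA bytes of a 48-bit taskfile into one sector number."""
    return (tf["hob_lbah"] << 40 | tf["hob_lbam"] << 32 | tf["hob_lbal"] << 24
            | tf["lbah"] << 16 | tf["lbam"] << 8 | tf["lbal"])


def sector_count48(tf):
    """Sector count of a 48-bit command; a raw value of 0 means 65536 sectors."""
    count = tf["hob_nsect"] << 8 | tf["nsect"]
    return count or 65536


if __name__ == "__main__":
    # Taskfile strings copied from the quoted syslog.
    cmd = parse_taskfile("25/00:00:00:15:00/00:04:00:00:00/e0")
    res = parse_taskfile("51/40:00:90:17:00/40:00:00:00:00/e0")
    print("read started at LBA", lba48(cmd), "for", sector_count48(cmd), "sectors")
    print("drive reported the error at LBA", lba48(res))
```

Under those assumptions the failed read covers 1024 sectors starting at LBA
5376 (1024 x 512 bytes matches the "dma 524288" in the log), and the drive
reports the uncorrectable sector at LBA 6032 each time.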
>
> I still get:
>
> Num  Test_Description  Status                   Remaining  LifeTime(hours)  LBA_of_first_error
> # 1  Offline           Completed: read failure  90%        8985             3912
>
> and
>
> 197 Current_Pending_Sector  0x0012  100  100  000  Old_age  Always   -  2
> 198 Offline_Uncorrectable   0x0030  100  100  000  Old_age  Offline  -  1
>
>
> How is it possible? Next thing I will try is manually failing /dev/sda
> and filling it with zeros. I would like to do a *low level format* but I
> didn't find the utility for my disk :(
>
I'm pretty sure there's no such thing as a *low level format* for any
modern disk (or not one that does anything more than writing a known
pattern to the disk). The low-level information is far too precisely
laid out for the disk heads to be able to write.

Writing zeros is certainly what I'd do in this situation - I've done it
for several drives in the past where they've had offline uncorrectable
sectors flagged.

Cheers,
    Robin
-- 
     ___
    ( ' }     |       Robin Hill        <robin@xxxxxxxxxxxxxxx> |
   / / )      | Little Jim says ....                            |
  // !!       |      "He fallen in de water !!"                 |