RE: RAID6 array lost a disk, can someone decode the error?

} -----Original Message-----
} From: Majed B. [mailto:majedb@xxxxxxxxx]
} Sent: Wednesday, November 11, 2009 12:52 AM
} To: Guy Watkins
} Cc: LinuxRaid
} Subject: Re: RAID6 array lost a disk, can someone decode the error?
} 
} You seem to have very high numbers in Hardware_ECC_Recovered and
} Raw_Read_Error_Rate. I suggest you replace your cables.

I thought I just did not understand those fields.  They seemed high/bad to
me too, but I do not have any other disks to compare them to.  Do you think
all 4 cables could be bad?  They are all the same, but I have no idea what
brand.

OK, is there a recommended vendor for new cables?
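
One possible reading of those raw numbers, for comparison: Seagate drives are
commonly reported to pack two counters into the 48-bit raw value of
Raw_Read_Error_Rate, with an error count in the upper 16 bits and an
operation count in the lower 32 bits.  That layout is an assumption (I know of
no vendor documentation for it), and the function name below is made up for
illustration.  A minimal sketch using the raw values from the smartctl output
quoted below:

```python
def split_seagate_raw(raw):
    """Split a 48-bit raw value into (errors, operations).

    ASSUMPTION: upper 16 bits = error count, lower 32 bits = operation
    count. This is the commonly reported Seagate layout, not a documented one.
    """
    errors = (raw >> 32) & 0xFFFF
    operations = raw & 0xFFFFFFFF
    return errors, operations


# Raw_Read_Error_Rate raw values copied from the quoted smartctl output.
raw_read_error_rate = {
    "sda": 77830969,
    "sdb": 136981744,
    "sdc": 221110249,
    "sdd": 25809154,
}

for disk, raw in raw_read_error_rate.items():
    errors, operations = split_seagate_raw(raw)
    print(f"{disk}: errors={errors} operations={operations}")
```

If that reading is right, every value here is below 2^32, so the error part
is zero on all four disks and the big numbers are mostly operation counts.
That alone would neither prove nor rule out bad cables.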
 
} You don't have bad sectors, which is good.
} 
} Are you using the controller for RAID or just as a way to connect your
} disks?

JBOD.  I did not know that controller had RAID.  :)

} I've had similar link-reset problems, but not write-related. Turns
} out one of the disks had a bad PCB.
} 
} On Wed, Nov 11, 2009 at 8:37 AM, Guy Watkins <guy@xxxxxxxxxxxxxxxx> wrote:
} > I have 2 4-disk RAID6 arrays that lose a disk sometimes.  Maybe once every
} > month or 3.  As far as I can tell I don't have disks that have unreadable
} > blocks.  The RAID1 arrays also lose disks sometimes.  I have the 4 disks on
} > 1 controller, from lspci:
} > 00:0e.0 Mass storage controller: Promise Technology, Inc. PDC20318 (SATA150 TX4) (rev 02)
} >
} > I thought the RAID6 logic corrected single block errors?  Maybe not on a
} > write? And I think this is a write because of "super_written"?
} >
} > The array is a RAID6 but the errors say RAID5?
} >
} > When I remove and add the disks back in they rebuild just fine.
} >
} > Anyway, does anyone understand what this error really is?  Is it bad disks?
} > Bad cable?  Bad controller?  Bad sunspots?  :)
} >
} > I did see that a SMART test had failed at about the same time.  I also read
} > that some disks or controllers can't handle SMART tests.  Could that be it?
} > I don't run SMART tests very often, so I know the other failures from the
} > past were not caused by a SMART test.  Maybe I am doing the tests wrong?  I
} > used this command: "smartctl --test=long /dev/sda"
} >
} > All info I think might be needed:
} >
} > The disks are all Seagate ST3320620AS (320 GB disks).
} >
} > # uname -a
} > Linux linux.watkins-home.com 2.6.27.35-170.2.94.fc10.i686 #1 SMP Thu Oct 1 14:58:51 EDT 2009 i686 i686 i386 GNU/Linux
} >
} > # rpm -qa mdadm
} > mdadm-2.6.9-1.fc10.i386
} >
} > From /var/log/messages-20091108
} > Nov  1 21:48:29 linux kernel: ata4.00: exception Emask 0x10 SAct 0x0 SErr 0x180203 action 0x6 frozen
} > Nov  1 21:48:29 linux kernel: ata4: SError: { RecovData RecovComm Persist 10B8B Dispar }
} > Nov  1 21:48:29 linux kernel: ata4.00: cmd ea/00:00:00:00:00/00:00:00:00:00/a0 tag 0
} > Nov  1 21:48:29 linux kernel: res 40/00:01:00:4f:c2/00:00:00:00:00/00 Emask 0x14 (ATA bus error)
} > Nov  1 21:48:29 linux kernel: ata4.00: status: { DRDY }
} > Nov  1 21:48:29 linux kernel: ata4: hard resetting link
} > Nov  1 21:48:31 linux kernel: ata4: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
} > Nov  1 21:48:31 linux kernel: ata4.00: configured for UDMA/133
} > Nov  1 21:48:31 linux kernel: ata4.00: device reported invalid CHS sector 0
} > Nov  1 21:48:31 linux kernel: ata4: EH complete
} > Nov  1 21:48:31 linux kernel: sd 3:0:0:0: [sdb] 625142448 512-byte hardware sectors (320073 MB)
} > Nov  1 21:48:31 linux kernel: end_request: I/O error, dev sdb, sector 34089705
} > Nov  1 21:48:31 linux kernel: md: super_written gets error=-5, uptodate=0
} > Nov  1 21:48:31 linux kernel: raid5: Disk failure on sdb2, disabling device.
} > Nov  1 21:48:31 linux kernel: raid5: Operation continuing on 3 devices.
} > Nov  1 21:48:31 linux kernel: sd 3:0:0:0: [sdb] Write Protect is off
} > Nov  1 21:48:31 linux kernel: sd 3:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
} > Nov  1 21:48:31 linux kernel: RAID5 conf printout:
} > Nov  1 21:48:31 linux kernel: --- rd:4 wd:3
} > Nov  1 21:48:31 linux kernel: disk 0, o:0, dev:sdb2
} > Nov  1 21:48:31 linux kernel: disk 1, o:1, dev:sdd2
} > Nov  1 21:48:31 linux kernel: disk 2, o:1, dev:sdc2
} > Nov  1 21:48:31 linux kernel: disk 3, o:1, dev:sda2
} > Nov  1 21:48:31 linux kernel: RAID5 conf printout:
} > Nov  1 21:48:31 linux kernel: --- rd:4 wd:3
} > Nov  1 21:48:31 linux kernel: disk 1, o:1, dev:sdd2
} > Nov  1 21:48:31 linux kernel: disk 2, o:1, dev:sdc2
} > Nov  1 21:48:31 linux kernel: disk 3, o:1, dev:sda2
} >
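
For reference, the SErr value in the log above can be decoded by hand.  A
minimal sketch, assuming the SATA SError bit positions as I understand them;
the table is deliberately partial (only the flags the kernel printed for this
event) and the names SERR_BITS and decode_serror are mine:

```python
# Partial SError bit table: only the flags that appear in the log above.
# ASSUMPTION: bit positions follow the SATA SError register layout.
SERR_BITS = {
    0: "RecovData",   # recovered data integrity error
    1: "RecovComm",   # recovered communications error
    9: "Persist",     # persistent communication or data integrity error
    19: "10B8B",      # 10b-to-8b decode error
    20: "Dispar",     # disparity error
}


def decode_serror(value):
    """Return (known flag names, mask of set bits not in the partial table)."""
    names = []
    unknown = 0
    for bit in range(32):
        if value & (1 << bit):
            if bit in SERR_BITS:
                names.append(SERR_BITS[bit])
            else:
                unknown |= 1 << bit
    return names, unknown


names, unknown = decode_serror(0x180203)
print(names, hex(unknown))
```

This reproduces the kernel's own "SError: { RecovData RecovComm Persist 10B8B
Dispar }" line with no bits left over.  10B8B and Dispar are link-level
signalling errors, which would seem to point at the cable, connector or
controller port more than at the platters, but that is an inference and not
something the log states.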
} > # cat /proc/mdstat
} > Personalities : [raid6] [raid5] [raid4] [raid1]
} > md0 : active raid1 sdd1[0] sda1[3] sdc1[2] sdb1[1]
} >      264960 blocks [4/4] [UUUU]
} >      bitmap: 0/33 pages [0KB], 4KB chunk
} >
} > md4 : active raid6 sdd4[0] sda4[3] sdc4[2] sdb4[4](F)
} >      586853888 blocks level 6, 256k chunk, algorithm 2 [4/3] [U_UU]
} >      bitmap: 70/140 pages [280KB], 1024KB chunk
} >
} > md2 : active raid1 sdb3[0] sda3[1]
} >      2096384 blocks [2/2] [UU]
} >      bitmap: 0/128 pages [0KB], 8KB chunk
} >
} > md1 : active raid1 sdd3[0] sdc3[1]
} >      2096384 blocks [2/2] [UU]
} >      bitmap: 0/128 pages [0KB], 8KB chunk
} >
} > md3 : active raid6 sdb2[4](F) sdd2[1] sda2[3] sdc2[2]
} >      33559552 blocks level 6, 256k chunk, algorithm 2 [4/3] [_UUU]
} >      bitmap: 119/129 pages [476KB], 64KB chunk
} >
} > unused devices: <none>
} >
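
The failed members can also be picked out of the mdstat output mechanically.
A minimal sketch, assuming array lines of the form "mdN : active <level>
<members...>" as in the output above (inactive arrays, which have no level
field, are not handled); the function name failed_members is made up for
illustration:

```python
import re

# Array lines copied from the /proc/mdstat output above (three of the five).
MDSTAT = """\
md0 : active raid1 sdd1[0] sda1[3] sdc1[2] sdb1[1]
      264960 blocks [4/4] [UUUU]
md4 : active raid6 sdd4[0] sda4[3] sdc4[2] sdb4[4](F)
      586853888 blocks level 6, 256k chunk, algorithm 2 [4/3] [U_UU]
md3 : active raid6 sdb2[4](F) sdd2[1] sda2[3] sdc2[2]
      33559552 blocks level 6, 256k chunk, algorithm 2 [4/3] [_UUU]
"""


def failed_members(mdstat_text):
    """Map each array name to the member devices flagged (F)."""
    result = {}
    for line in mdstat_text.splitlines():
        # Matches "md4 : active raid6 <members>"; other lines are skipped.
        m = re.match(r"^(md\d+) : \S+ \S+ (.*)$", line)
        if m:
            array, members = m.groups()
            result[array] = re.findall(r"(\w+)\[\d+\]\(F\)", members)
    return result


print(failed_members(MDSTAT))
```

On a live system the text would come from reading /proc/mdstat instead of the
pasted string.  Here both failed members (sdb2 and sdb4) are partitions of
sdb, the same disk that the ata4 link reset in the log refers to.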
} > # smartctl -a /dev/sda
} > smartctl version 5.38 [i386-redhat-linux-gnu] Copyright (C) 2002-8 Bruce
} > Allen
} > Home page is http://smartmontools.sourceforge.net/
} >
} > === START OF INFORMATION SECTION ===
} > Model Family:     Seagate Barracuda 7200.10 family
} > Device Model:     ST3320620AS
} > Serial Number:    3QF08NDL
} > Firmware Version: 3.AAD
} > User Capacity:    320,072,933,376 bytes
} > Device is:        In smartctl database [for details use: -P show]
} > ATA Version is:   7
} > ATA Standard is:  Exact ATA specification draft version not indicated
} > Local Time is:    Tue Nov 10 23:57:28 2009 EST
} > SMART support is: Available - device has SMART capability.
} > SMART support is: Enabled
} >
} > === START OF READ SMART DATA SECTION ===
} > SMART overall-health self-assessment test result: PASSED
} >
} > General SMART Values:
} > Offline data collection status:  (0x82) Offline data collection activity
} >                                        was completed without error.
} >                                        Auto Offline Data Collection:
} > Enabled.
} > Self-test execution status:      (   0) The previous self-test routine
} > completed
} >                                        without error or no self-test has
} > ever
} >                                        been run.
} > Total time to complete Offline
} > data collection:                 ( 430) seconds.
} > Offline data collection
} > capabilities:                    (0x5b) SMART execute Offline immediate.
} >                                        Auto Offline data collection
} on/off
} > support.
} >                                        Suspend Offline collection upon
} new
} >                                        command.
} >                                        Offline surface scan supported.
} >                                        Self-test supported.
} >                                        No Conveyance Self-test
} supported.
} >                                        Selective Self-test supported.
} > SMART capabilities:            (0x0003) Saves SMART data before entering
} >                                        power-saving mode.
} >                                        Supports SMART auto save timer.
} > Error logging capability:        (0x01) Error logging supported.
} >                                        General Purpose Logging
} supported.
} > Short self-test routine
} > recommended polling time:        (   1) minutes.
} > Extended self-test routine
} > recommended polling time:        ( 115) minutes.
} >
} > SMART Attributes Data Structure revision number: 10
} > Vendor Specific SMART Attributes with Thresholds:
} > ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
} >   1 Raw_Read_Error_Rate     0x000f   114   097   006    Pre-fail  Always       -       77830969
} >   3 Spin_Up_Time            0x0003   094   090   000    Pre-fail  Always       -       0
} >   4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always       -       83
} >   5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0
} >   7 Seek_Error_Rate         0x000f   081   060   030    Pre-fail  Always       -       150227385
} >   9 Power_On_Hours          0x0032   073   073   000    Old_age   Always       -       23919
} >  10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
} >  12 Power_Cycle_Count       0x0032   100   100   020    Old_age   Always       -       116
} > 187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0
} > 189 High_Fly_Writes         0x003a   100   100   000    Old_age   Always       -       0
} > 190 Airflow_Temperature_Cel 0x0022   061   046   045    Old_age   Always       -       39 (Lifetime Min/Max 37/43)
} > 194 Temperature_Celsius     0x0022   039   054   000    Old_age   Always       -       39 (0 21 0 0)
} > 195 Hardware_ECC_Recovered  0x001a   065   054   000    Old_age   Always       -       102168431
} > 197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
} > 198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
} > 199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
} > 200 Multi_Zone_Error_Rate   0x0000   100   253   000    Old_age   Offline      -       0
} > 202 TA_Increase_Count       0x0032   100   253   000    Old_age   Always       -       0
} >
} > SMART Error Log Version: 1
} > No Errors Logged
} >
} > SMART Self-test log structure revision number 1
} > Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
} > # 1  Extended offline    Completed without error       00%     23730         -
} > # 2  Extended offline    Completed without error       00%     22581         -
} > # 3  Short offline       Completed without error       00%     22577         -
} > # 4  Extended offline    Completed without error       00%     17267         -
} > # 5  Short offline       Completed without error       00%     17259         -
} > # 6  Extended offline    Completed without error       00%       384         -
} >
} > SMART Selective self-test log data structure revision number 1
} >  SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
} >    1        0        0  Not_testing
} >    2        0        0  Not_testing
} >    3        0        0  Not_testing
} >    4        0        0  Not_testing
} >    5        0        0  Not_testing
} > Selective self-test flags (0x0):
} >  After scanning selected spans, do NOT read-scan remainder of disk.
} > If Selective self-test is pending on power-up, resume after 0 minute
} delay.
} >
} > # smartctl -a /dev/sdb
} > smartctl version 5.38 [i386-redhat-linux-gnu] Copyright (C) 2002-8 Bruce
} > Allen
} > Home page is http://smartmontools.sourceforge.net/
} >
} > === START OF INFORMATION SECTION ===
} > Model Family:     Seagate Barracuda 7200.10 family
} > Device Model:     ST3320620AS
} > Serial Number:    3QF08SKR
} > Firmware Version: 3.AAD
} > User Capacity:    320,072,933,376 bytes
} > Device is:        In smartctl database [for details use: -P show]
} > ATA Version is:   7
} > ATA Standard is:  Exact ATA specification draft version not indicated
} > Local Time is:    Wed Nov 11 00:03:14 2009 EST
} > SMART support is: Available - device has SMART capability.
} > SMART support is: Enabled
} >
} > === START OF READ SMART DATA SECTION ===
} > SMART overall-health self-assessment test result: PASSED
} >
} > General SMART Values:
} > Offline data collection status:  (0x82) Offline data collection activity
} >                                        was completed without error.
} >                                        Auto Offline Data Collection:
} > Enabled.
} > Self-test execution status:      (  37) The self-test routine was
} > interrupted
} >                                        by the host with a hard or soft
} > reset.
} > Total time to complete Offline
} > data collection:                 ( 430) seconds.
} > Offline data collection
} > capabilities:                    (0x5b) SMART execute Offline immediate.
} >                                        Auto Offline data collection
} on/off
} > support.
} >                                        Suspend Offline collection upon
} new
} >                                        command.
} >                                        Offline surface scan supported.
} >                                        Self-test supported.
} >                                        No Conveyance Self-test
} supported.
} >                                        Selective Self-test supported.
} > SMART capabilities:            (0x0003) Saves SMART data before entering
} >                                        power-saving mode.
} >                                        Supports SMART auto save timer.
} > Error logging capability:        (0x01) Error logging supported.
} >                                        General Purpose Logging
} supported.
} > Short self-test routine
} > recommended polling time:        (   1) minutes.
} > Extended self-test routine
} > recommended polling time:        ( 115) minutes.
} >
} > SMART Attributes Data Structure revision number: 10
} > Vendor Specific SMART Attributes with Thresholds:
} > ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
} >   1 Raw_Read_Error_Rate     0x000f   111   091   006    Pre-fail  Always       -       136981744
} >   3 Spin_Up_Time            0x0003   099   090   000    Pre-fail  Always       -       0
} >   4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always       -       104
} >   5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       1
} >   7 Seek_Error_Rate         0x000f   084   060   030    Pre-fail  Always       -       257877357
} >   9 Power_On_Hours          0x0032   073   073   000    Old_age   Always       -       23916
} >  10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
} >  12 Power_Cycle_Count       0x0032   100   100   020    Old_age   Always       -       157
} > 187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0
} > 189 High_Fly_Writes         0x003a   100   100   000    Old_age   Always       -       0
} > 190 Airflow_Temperature_Cel 0x0022   059   049   045    Old_age   Always       -       41 (Lifetime Min/Max 38/43)
} > 194 Temperature_Celsius     0x0022   041   051   000    Old_age   Always       -       41 (0 21 0 0)
} > 195 Hardware_ECC_Recovered  0x001a   063   054   000    Old_age   Always       -       160751697
} > 197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
} > 198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
} > 199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
} > 200 Multi_Zone_Error_Rate   0x0000   100   253   000    Old_age   Offline      -       0
} > 202 TA_Increase_Count       0x0032   100   253   000    Old_age   Always       -       0
} >
} > SMART Error Log Version: 1
} > No Errors Logged
} >
} > SMART Self-test log structure revision number 1
} > Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
} > # 1  Extended offline    Interrupted (host reset)      50%     23726         -
} > # 2  Extended offline    Completed without error       00%     22580         -
} > # 3  Short offline       Completed without error       00%     22577         -
} > # 4  Extended offline    Completed without error       00%     17267         -
} > # 5  Short offline       Completed without error       00%     17260         -
} > # 6  Extended offline    Completed without error       00%       384         -
} >
} > SMART Selective self-test log data structure revision number 1
} >  SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
} >    1        0        0  Not_testing
} >    2        0        0  Not_testing
} >    3        0        0  Not_testing
} >    4        0        0  Not_testing
} >    5        0        0  Not_testing
} > Selective self-test flags (0x0):
} >  After scanning selected spans, do NOT read-scan remainder of disk.
} > If Selective self-test is pending on power-up, resume after 0 minute
} delay.
} >
} > # smartctl -a /dev/sdc
} > smartctl version 5.38 [i386-redhat-linux-gnu] Copyright (C) 2002-8 Bruce
} > Allen
} > Home page is http://smartmontools.sourceforge.net/
} >
} > === START OF INFORMATION SECTION ===
} > Model Family:     Seagate Barracuda 7200.10 family
} > Device Model:     ST3320620AS
} > Serial Number:    3QF08V24
} > Firmware Version: 3.AAD
} > User Capacity:    320,072,933,376 bytes
} > Device is:        In smartctl database [for details use: -P show]
} > ATA Version is:   7
} > ATA Standard is:  Exact ATA specification draft version not indicated
} > Local Time is:    Wed Nov 11 00:03:36 2009 EST
} > SMART support is: Available - device has SMART capability.
} > SMART support is: Enabled
} >
} > === START OF READ SMART DATA SECTION ===
} > SMART overall-health self-assessment test result: PASSED
} > See vendor-specific Attribute list for marginal Attributes.
} >
} > General SMART Values:
} > Offline data collection status:  (0x82) Offline data collection activity
} >                                        was completed without error.
} >                                        Auto Offline Data Collection:
} > Enabled.
} > Self-test execution status:      (   0) The previous self-test routine
} > completed
} >                                        without error or no self-test has
} > ever
} >                                        been run.
} > Total time to complete Offline
} > data collection:                 ( 430) seconds.
} > Offline data collection
} > capabilities:                    (0x5b) SMART execute Offline immediate.
} >                                        Auto Offline data collection
} on/off
} > support.
} >                                        Suspend Offline collection upon
} new
} >                                        command.
} >                                        Offline surface scan supported.
} >                                        Self-test supported.
} >                                        No Conveyance Self-test
} supported.
} >                                        Selective Self-test supported.
} > SMART capabilities:            (0x0003) Saves SMART data before entering
} >                                        power-saving mode.
} >                                        Supports SMART auto save timer.
} > Error logging capability:        (0x01) Error logging supported.
} >                                        General Purpose Logging
} supported.
} > Short self-test routine
} > recommended polling time:        (   1) minutes.
} > Extended self-test routine
} > recommended polling time:        ( 115) minutes.
} >
} > SMART Attributes Data Structure revision number: 10
} > Vendor Specific SMART Attributes with Thresholds:
} > ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
} >   1 Raw_Read_Error_Rate     0x000f   119   090   006    Pre-fail  Always       -       221110249
} >   3 Spin_Up_Time            0x0003   094   090   000    Pre-fail  Always       -       0
} >   4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always       -       94
} >   5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0
} >   7 Seek_Error_Rate         0x000f   081   060   030    Pre-fail  Always       -       138219006
} >   9 Power_On_Hours          0x0032   073   073   000    Old_age   Always       -       23917
} >  10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
} >  12 Power_Cycle_Count       0x0032   100   100   020    Old_age   Always       -       130
} > 187 Reported_Uncorrect      0x0032   082   082   000    Old_age   Always       -       18
} > 189 High_Fly_Writes         0x003a   100   100   000    Old_age   Always       -       0
} > 190 Airflow_Temperature_Cel 0x0022   059   044   045    Old_age   Always   In_the_past 41 (Lifetime Min/Max 39/45)
} > 194 Temperature_Celsius     0x0022   041   056   000    Old_age   Always       -       41 (0 22 0 0)
} > 195 Hardware_ECC_Recovered  0x001a   066   057   000    Old_age   Always       -       145841009
} > 197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
} > 198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
} > 199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
} > 200 Multi_Zone_Error_Rate   0x0000   100   253   000    Old_age   Offline      -       0
} > 202 TA_Increase_Count       0x0032   100   253   000    Old_age   Always       -       0
} >
} > SMART Error Log Version: 1
} > ATA Error Count: 18 (device log contains only the most recent five
} errors)
} >        CR = Command Register [HEX]
} >        FR = Features Register [HEX]
} >        SC = Sector Count Register [HEX]
} >        SN = Sector Number Register [HEX]
} >        CL = Cylinder Low Register [HEX]
} >        CH = Cylinder High Register [HEX]
} >        DH = Device/Head Register [HEX]
} >        DC = Device Command Register [HEX]
} >        ER = Error register [HEX]
} >        ST = Status register [HEX]
} > Powered_Up_Time is measured from power on, and printed as
} > DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
} > SS=sec, and sss=millisec. It "wraps" after 49.710 days.
} >
} > Error 18 occurred at disk power-on lifetime: 5380 hours (224 days + 4
} hours)
} >  When the command that caused the error occurred, the device was active
} or
} > idle.
} >
} >  After command completion occurred, registers were:
} >  ER ST SC SN CL CH DH
} >  -- -- -- -- -- -- --
} >  40 51 00 63 81 09 e0  Error: UNC at LBA = 0x00098163 = 622947
} >
} >  Commands leading to the command that caused the error were:
} >  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
} >  -- -- -- -- -- -- -- --  ----------------  --------------------
} >  25 00 00 e1 7e 09 e0 00      00:16:26.026  READ DMA EXT
} >  ec 00 00 00 00 00 a0 00      00:16:26.022  IDENTIFY DEVICE
} >  ef 03 46 00 00 00 a0 00      00:16:26.022  SET FEATURES [Set transfer
} > mode]
} >  ec 00 00 00 00 00 a0 00      00:16:26.019  IDENTIFY DEVICE
} >  25 00 00 e1 7e 09 e0 00      00:16:24.456  READ DMA EXT
} >
} > Error 17 occurred at disk power-on lifetime: 5380 hours (224 days + 4
} hours)
} >  When the command that caused the error occurred, the device was active
} or
} > idle.
} >
} >  After command completion occurred, registers were:
} >  ER ST SC SN CL CH DH
} >  -- -- -- -- -- -- --
} >  40 51 00 63 81 09 e0  Error: UNC at LBA = 0x00098163 = 622947
} >
} >  Commands leading to the command that caused the error were:
} >  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
} >  -- -- -- -- -- -- -- --  ----------------  --------------------
} >  25 00 00 e1 7e 09 e0 00      00:16:21.313  READ DMA EXT
} >  ec 00 00 00 00 00 a0 00      00:16:19.753  IDENTIFY DEVICE
} >  ef 03 46 00 00 00 a0 00      00:16:19.749  SET FEATURES [Set transfer
} > mode]
} >  ec 00 00 00 00 00 a0 00      00:16:19.749  IDENTIFY DEVICE
} >  25 00 00 e1 7e 09 e0 00      00:16:24.456  READ DMA EXT
} >
} > Error 16 occurred at disk power-on lifetime: 5380 hours (224 days + 4
} hours)
} >  When the command that caused the error occurred, the device was active
} or
} > idle.
} >
} >  After command completion occurred, registers were:
} >  ER ST SC SN CL CH DH
} >  -- -- -- -- -- -- --
} >  40 51 00 63 81 09 e0  Error: UNC at LBA = 0x00098163 = 622947
} >
} >  Commands leading to the command that caused the error were:
} >  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
} >  -- -- -- -- -- -- -- --  ----------------  --------------------
} >  25 00 00 e1 7e 09 e0 00      00:16:21.313  READ DMA EXT
} >  ec 00 00 00 00 00 a0 00      00:16:19.753  IDENTIFY DEVICE
} >  ef 03 46 00 00 00 a0 00      00:16:19.749  SET FEATURES [Set transfer
} > mode]
} >  ec 00 00 00 00 00 a0 00      00:16:19.749  IDENTIFY DEVICE
} >  25 00 00 e1 7e 09 e0 00      00:16:19.745  READ DMA EXT
} >
} > Error 15 occurred at disk power-on lifetime: 5380 hours (224 days + 4
} hours)
} >  When the command that caused the error occurred, the device was active
} or
} > idle.
} >
} >  After command completion occurred, registers were:
} >  ER ST SC SN CL CH DH
} >  -- -- -- -- -- -- --
} >  40 51 00 63 81 09 e0  Error: UNC at LBA = 0x00098163 = 622947
} >
} >  Commands leading to the command that caused the error were:
} >  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
} >  -- -- -- -- -- -- -- --  ----------------  --------------------
} >  25 00 00 e1 7e 09 e0 00      00:16:21.313  READ DMA EXT
} >  ec 00 00 00 00 00 a0 00      00:16:19.753  IDENTIFY DEVICE
} >  ef 03 46 00 00 00 a0 00      00:16:19.749  SET FEATURES [Set transfer
} > mode]
} >  ec 00 00 00 00 00 a0 00      00:16:19.749  IDENTIFY DEVICE
} >  25 00 00 e1 7e 09 e0 00      00:16:19.745  READ DMA EXT
} >
} > Error 14 occurred at disk power-on lifetime: 5380 hours (224 days + 4
} hours)
} >  When the command that caused the error occurred, the device was active
} or
} > idle.
} >
} >  After command completion occurred, registers were:
} >  ER ST SC SN CL CH DH
} >  -- -- -- -- -- -- --
} >  40 51 00 63 81 09 e0  Error: UNC at LBA = 0x00098163 = 622947
} >
} >  Commands leading to the command that caused the error were:
} >  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
} >  -- -- -- -- -- -- -- --  ----------------  --------------------
} >  25 00 00 e1 7e 09 e0 00      00:16:17.672  READ DMA EXT
} >  ec 00 00 00 00 00 a0 00      00:16:19.753  IDENTIFY DEVICE
} >  ef 03 46 00 00 00 a0 00      00:16:19.749  SET FEATURES [Set transfer
} > mode]
} >  ec 00 00 00 00 00 a0 00      00:16:19.749  IDENTIFY DEVICE
} >  25 00 00 e1 7e 09 e0 00      00:16:19.745  READ DMA EXT
} >
} > SMART Self-test log structure revision number 1
} > Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
} > # 1  Extended offline    Completed without error       00%     23728         -
} > # 2  Extended offline    Completed without error       00%     22579         -
} > # 3  Short offline       Completed without error       00%     22576         -
} > # 4  Extended offline    Completed without error       00%     17265         -
} > # 5  Short offline       Completed without error       00%     17257         -
} > # 6  Extended offline    Completed without error       00%       384         -
} >
} > SMART Selective self-test log data structure revision number 1
} >  SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
} >    1        0        0  Not_testing
} >    2        0        0  Not_testing
} >    3        0        0  Not_testing
} >    4        0        0  Not_testing
} >    5        0        0  Not_testing
} > Selective self-test flags (0x0):
} >  After scanning selected spans, do NOT read-scan remainder of disk.
} > If Selective self-test is pending on power-up, resume after 0 minute
} delay.
} >
} > # smartctl -a /dev/sdd
} > smartctl version 5.38 [i386-redhat-linux-gnu] Copyright (C) 2002-8 Bruce
} > Allen
} > Home page is http://smartmontools.sourceforge.net/
} >
} > === START OF INFORMATION SECTION ===
} > Model Family:     Seagate Barracuda 7200.10 family
} > Device Model:     ST3320620AS
} > Serial Number:    3QF08WDP
} > Firmware Version: 3.AAD
} > User Capacity:    320,072,933,376 bytes
} > Device is:        In smartctl database [for details use: -P show]
} > ATA Version is:   7
} > ATA Standard is:  Exact ATA specification draft version not indicated
} > Local Time is:    Wed Nov 11 00:04:04 2009 EST
} > SMART support is: Available - device has SMART capability.
} > SMART support is: Enabled
} >
} > === START OF READ SMART DATA SECTION ===
} > SMART overall-health self-assessment test result: PASSED
} >
} > General SMART Values:
} > Offline data collection status:  (0x82) Offline data collection activity
} >                                        was completed without error.
} >                                        Auto Offline Data Collection:
} > Enabled.
} > Self-test execution status:      (   0) The previous self-test routine
} > completed
} >                                        without error or no self-test has
} > ever
} >                                        been run.
} > Total time to complete Offline
} > data collection:                 ( 430) seconds.
} > Offline data collection
} > capabilities:                    (0x5b) SMART execute Offline immediate.
} >                                        Auto Offline data collection
} on/off
} > support.
} >                                        Suspend Offline collection upon
} new
} >                                        command.
} >                                        Offline surface scan supported.
} >                                        Self-test supported.
} >                                        No Conveyance Self-test
} supported.
} >                                        Selective Self-test supported.
} > SMART capabilities:            (0x0003) Saves SMART data before entering
} >                                        power-saving mode.
} >                                        Supports SMART auto save timer.
} > Error logging capability:        (0x01) Error logging supported.
} >                                        General Purpose Logging
} supported.
} > Short self-test routine
} > recommended polling time:        (   1) minutes.
} > Extended self-test routine
} > recommended polling time:        ( 115) minutes.
} >
} > SMART Attributes Data Structure revision number: 10
} > Vendor Specific SMART Attributes with Thresholds:
} > ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
} >   1 Raw_Read_Error_Rate     0x000f   110   090   006    Pre-fail  Always       -       25809154
} >   3 Spin_Up_Time            0x0003   098   090   000    Pre-fail  Always       -       0
} >   4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always       -       516
} >   5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0
} >   7 Seek_Error_Rate         0x000f   082   060   030    Pre-fail  Always       -       192909989
} >   9 Power_On_Hours          0x0032   073   073   000    Old_age   Always       -       23896
} >  10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
} >  12 Power_Cycle_Count       0x0032   100   100   020    Old_age   Always       -       777
} > 187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0
} > 189 High_Fly_Writes         0x003a   100   100   000    Old_age   Always       -       0
} > 190 Airflow_Temperature_Cel 0x0022   061   050   045    Old_age   Always       -       39 (Lifetime Min/Max 36/42)
} > 194 Temperature_Celsius     0x0022   039   050   000    Old_age   Always       -       39 (0 20 0 0)
} > 195 Hardware_ECC_Recovered  0x001a   064   055   000    Old_age   Always       -       81546876
} > 197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
} > 198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
} > 199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
} > 200 Multi_Zone_Error_Rate   0x0000   100   253   000    Old_age   Offline      -       0
} > 202 TA_Increase_Count       0x0032   100   253   000    Old_age   Always       -       0
} >
} > SMART Error Log Version: 1
} > ATA Error Count: 6 (device log contains only the most recent five errors)
} >        CR = Command Register [HEX]
} >        FR = Features Register [HEX]
} >        SC = Sector Count Register [HEX]
} >        SN = Sector Number Register [HEX]
} >        CL = Cylinder Low Register [HEX]
} >        CH = Cylinder High Register [HEX]
} >        DH = Device/Head Register [HEX]
} >        DC = Device Command Register [HEX]
} >        ER = Error register [HEX]
} >        ST = Status register [HEX]
} > Powered_Up_Time is measured from power on, and printed as
} > DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
} > SS=sec, and sss=millisec. It "wraps" after 49.710 days.
} >
} > Error 6 occurred at disk power-on lifetime: 10007 hours (416 days + 23 hours)
} >  When the command that caused the error occurred, the device was active or idle.
} >
} >  After command completion occurred, registers were:
} >  ER ST SC SN CL CH DH
} >  -- -- -- -- -- -- --
} >  40 51 00 a5 0d 4a e0  Error: UNC at LBA = 0x004a0da5 = 4853157
} >
} >  Commands leading to the command that caused the error were:
} >  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
} >  -- -- -- -- -- -- -- --  ----------------  --------------------
} >  25 00 00 8f 0c 4a e0 00      00:05:45.657  READ DMA EXT
} >  27 00 00 00 00 00 e0 00      00:05:45.654  READ NATIVE MAX ADDRESS EXT
} >  ec 00 00 00 00 00 a0 00      00:05:43.727  IDENTIFY DEVICE
} >  ef 03 46 00 00 00 a0 00      00:05:43.660  SET FEATURES [Set transfer mode]
} >  27 00 00 00 00 00 e0 00      00:05:43.658  READ NATIVE MAX ADDRESS EXT
} >
} > Error 5 occurred at disk power-on lifetime: 10007 hours (416 days + 23 hours)
} >  When the command that caused the error occurred, the device was active or idle.
} >
} >  After command completion occurred, registers were:
} >  ER ST SC SN CL CH DH
} >  -- -- -- -- -- -- --
} >  40 51 00 a5 0d 4a e0  Error: UNC at LBA = 0x004a0da5 = 4853157
} >
} >  Commands leading to the command that caused the error were:
} >  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
} >  -- -- -- -- -- -- -- --  ----------------  --------------------
} >  25 00 00 8f 0c 4a e0 00      00:05:45.657  READ DMA EXT
} >  27 00 00 00 00 00 e0 00      00:05:45.654  READ NATIVE MAX ADDRESS EXT
} >  ec 00 00 00 00 00 a0 00      00:05:43.727  IDENTIFY DEVICE
} >  ef 03 46 00 00 00 a0 00      00:05:43.660  SET FEATURES [Set transfer mode]
} >  27 00 00 00 00 00 e0 00      00:05:43.658  READ NATIVE MAX ADDRESS EXT
} >
} > Error 4 occurred at disk power-on lifetime: 10007 hours (416 days + 23 hours)
} >  When the command that caused the error occurred, the device was active or idle.
} >
} >  After command completion occurred, registers were:
} >  ER ST SC SN CL CH DH
} >  -- -- -- -- -- -- --
} >  40 51 00 a5 0d 4a e0  Error: UNC at LBA = 0x004a0da5 = 4853157
} >
} >  Commands leading to the command that caused the error were:
} >  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
} >  -- -- -- -- -- -- -- --  ----------------  --------------------
} >  25 00 00 8f 0c 4a e0 00      00:05:39.547  READ DMA EXT
} >  27 00 00 00 00 00 e0 00      00:05:39.544  READ NATIVE MAX ADDRESS EXT
} >  ec 00 00 00 00 00 a0 00      00:05:43.727  IDENTIFY DEVICE
} >  ef 03 46 00 00 00 a0 00      00:05:43.660  SET FEATURES [Set transfer mode]
} >  27 00 00 00 00 00 e0 00      00:05:43.658  READ NATIVE MAX ADDRESS EXT
} >
} > Error 3 occurred at disk power-on lifetime: 10007 hours (416 days + 23 hours)
} >  When the command that caused the error occurred, the device was active or idle.
} >
} >  After command completion occurred, registers were:
} >  ER ST SC SN CL CH DH
} >  -- -- -- -- -- -- --
} >  40 51 00 a5 0d 4a e0  Error: UNC at LBA = 0x004a0da5 = 4853157
} >
} >  Commands leading to the command that caused the error were:
} >  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
} >  -- -- -- -- -- -- -- --  ----------------  --------------------
} >  25 00 00 8f 0c 4a e0 00      00:05:39.547  READ DMA EXT
} >  27 00 00 00 00 00 e0 00      00:05:39.544  READ NATIVE MAX ADDRESS EXT
} >  ec 00 00 00 00 00 a0 00      00:05:39.530  IDENTIFY DEVICE
} >  ef 03 46 00 00 00 a0 00      00:05:39.475  SET FEATURES [Set transfer mode]
} >  27 00 00 00 00 00 e0 00      00:05:39.472  READ NATIVE MAX ADDRESS EXT
} >
} > Error 2 occurred at disk power-on lifetime: 10007 hours (416 days + 23 hours)
} >  When the command that caused the error occurred, the device was active or idle.
} >
} >  After command completion occurred, registers were:
} >  ER ST SC SN CL CH DH
} >  -- -- -- -- -- -- --
} >  40 51 00 a5 0d 4a e0  Error: UNC at LBA = 0x004a0da5 = 4853157
} >
} >  Commands leading to the command that caused the error were:
} >  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
} >  -- -- -- -- -- -- -- --  ----------------  --------------------
} >  25 00 00 8f 0c 4a e0 00      00:05:39.547  READ DMA EXT
} >  27 00 00 00 00 00 e0 00      00:05:39.544  READ NATIVE MAX ADDRESS EXT
} >  ec 00 00 00 00 00 a0 00      00:05:39.530  IDENTIFY DEVICE
} >  ef 03 46 00 00 00 a0 00      00:05:39.475  SET FEATURES [Set transfer mode]
} >  27 00 00 00 00 00 e0 00      00:05:39.472  READ NATIVE MAX ADDRESS EXT
} >
} > SMART Self-test log structure revision number 1
} > Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
} > # 1  Extended offline    Completed without error       00%     23707         -
} > # 2  Extended offline    Completed without error       00%     22559         -
} > # 3  Short offline       Completed without error       00%     22555         -
} > # 4  Extended offline    Completed without error       00%     17248         -
} > # 5  Short offline       Completed without error       00%     17241         -
} > # 6  Short offline       Completed without error       00%     17241         -
} > # 7  Extended offline    Completed without error       00%       384         -
} > # 8  Short offline       Completed without error       00%       381         -
} >
} > SMART Selective self-test log data structure revision number 1
} >  SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
} >    1        0        0  Not_testing
} >    2        0        0  Not_testing
} >    3        0        0  Not_testing
} >    4        0        0  Not_testing
} >    5        0        0  Not_testing
} > Selective self-test flags (0x0):
} >  After scanning selected spans, do NOT read-scan remainder of disk.
} > If Selective self-test is pending on power-up, resume after 0 minute delay.
} >
} > Thanks,
} > Guy
} >
} > --
} > To unsubscribe from this list: send the line "unsubscribe linux-raid" in
} > the body of a message to majordomo@xxxxxxxxxxxxxxx
} > More majordomo info at  http://vger.kernel.org/majordomo-info.html
} >
} 
} 
} 
} --
}        Majed B.
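
For anyone reading the SMART error log quoted above: the LBA on the
"Error: UNC at LBA" line appears to be assembled from the SN, CL, CH and DH
register bytes printed just before it. Below is a rough Python sketch of that
arithmetic, assuming the classic 28-bit layout (low nibble of DH as the top
four bits). The function names are made up for illustration and are not part
of smartctl.

```python
def lba28_from_registers(sn, cl, ch, dh):
    """Assemble a 28-bit LBA from ATA task-file register bytes.

    Assumed layout: SN = bits 0-7, CL = bits 8-15, CH = bits 16-23,
    low nibble of DH = bits 24-27.
    """
    return ((dh & 0x0F) << 24) | (ch << 16) | (cl << 8) | sn


def hours_to_days(hours):
    """Split a power-on hour count into (days, leftover hours)."""
    return divmod(hours, 24)


# Register bytes from the "ER ST SC SN CL CH DH" line of the quoted log:
#   40 51 00 a5 0d 4a e0
lba = lba28_from_registers(0xA5, 0x0D, 0x4A, 0xE0)
print(hex(lba), lba)

# Power-on lifetime reported for each logged error.
print(hours_to_days(10007))
```

With the values from the log this gives 0x4a0da5 (4853157) and 416 days plus
23 hours, matching what smartctl printed.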

