Re: Disk near failure




On Thu, 27 Oct 2016 11:25, Alessandro Baggi wrote:
On 24/10/2016 14:05, Leonard den Ottolander wrote:
 On Mon, 2016-10-24 at 12:07 +0200, Alessandro Baggi wrote:
>  === START OF READ SMART DATA SECTION ===
>  SMART Error Log not supported

 I reckon there's a <snip> between those lines. The line right after the
 first should read something like:

 SMART overall-health self-assessment test result: PASSED

 or "FAILED" for that matter. If not try running

 smartctl -t short /dev/sda

 , wait for the indicated time to expire, then check the output of
 smartctl -a (or -x) again.

 Regards,
 Leonard.

Hi Leonard,
after a SMART short self-test, the output of smartctl -a /dev/... is

=== START OF INFORMATION SECTION ===
Model Family:     SandForce Driven SSDs
Device Model:     Corsair Force GT
Serial Number:    12297948000015020A81
LU WWN Device Id: 0 000000 000000000
Firmware Version: 5.02
User Capacity:    120,034,123,776 bytes [120 GB]
Sector Size:      512 bytes logical/physical
Rotation Rate:    Solid State Device
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ATA8-ACS, ACS-2 T13/2015-D revision 3
SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Thu Oct 27 11:22:22 2016 CEST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x02) Offline data collection activity
                                        was completed without error.
                                        Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                       without error or no self-test has ever
                                       been run.
Total time to complete Offline
data collection:                (    0) seconds.
Offline data collection
capabilities:                    (0x7b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                       General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   1) minutes.
Extended self-test routine
recommended polling time:        (  48) minutes.
Conveyance self-test routine
recommended polling time:        (   2) minutes.
SCT capabilities:              (0x0021) SCT Status supported.
                                        SCT Data Table supported.

SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
 1 Raw_Read_Error_Rate     0x000f   120   120   050    Pre-fail Always  -  0/0
 5 Retired_Block_Count     0x0033   100   100   003    Pre-fail Always  -  0
 9 Power_On_Hours_and_Msec 0x0032   000   000   000    Old_age  Always  -  17394h+07m+56.840s
12 Power_Cycle_Count       0x0032   099   099   000    Old_age  Always  -  1974
171 Program_Fail_Count     0x0032   000   000   000    Old_age  Always  -  0
172 Erase_Fail_Count       0x0032   000   000   000    Old_age  Always  -  0
174 Unexpect_Power_Loss_Ct 0x0030   000   000   000    Old_age  Offline -  780
177 Wear_Range_Delta       0x0000   000   000   000    Old_age  Offline -  3
181 Program_Fail_Count     0x0032   000   000   000    Old_age  Always  -  0
182 Erase_Fail_Count       0x0032   000   000   000    Old_age  Always  -  0
187 Reported_Uncorrect     0x0032   100   100   000    Old_age  Always  -  0
194 Temperature_Celsius    0x0022   029   042   000    Old_age  Always  -  29 (Min/Max 15/42)
195 ECC_Uncorr_Error_Count 0x001c   100   100   000    Old_age  Offline -  0/0
196 Reallocated_Event_Ct   0x0033   100   100   003    Pre-fail Always  -  0
201 Unc_Soft_Read_Err_Rate 0x001c   100   100   000    Old_age  Offline -  0/0
204 Soft_ECC_Correct_Rate  0x001c   100   100   000    Old_age  Offline -  0/0
230 Life_Curve_Status      0x0013   100   100   000    Pre-fail Always  -  100
231 SSD_Life_Left          0x0013   100   100   010    Pre-fail Always  -  0
233 SandForce_Internal     0x0000   000   000   000    Old_age  Offline -  6599
234 SandForce_Internal     0x0032   000   000   000    Old_age  Always  -  6894
241 Lifetime_Writes_GiB    0x0032   000   000   000    Old_age  Always  -  6894
242 Lifetime_Reads_GiB     0x0032   000   000   000    Old_age  Always  -  6326

SMART Error Log not supported

SMART Self-test Log not supported

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
 After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

Hmm, let's do some math:
17394 hours of "on" time equals 724.7 days (if powered on continuously).
6894 GiB written to a 120 GB drive gives roughly 57.4 drive writes
(with optimal wear leveling, every cell would have been written 57-58 times).

The SandForce controller used here (likely an SF-2281) is not the best at
wear leveling, so the actual write count per cell is most likely more
than double that.
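
For anyone who wants to redo the estimate with their own numbers, here is a
minimal Python sketch of the arithmetic above. The input values are copied
from the smartctl output in this thread. The variable names are made up for
illustration, and the factor of 2 for uneven wear leveling is only the rough
guess stated above, not a measured figure. Like the estimate, it ignores the
small difference between GB and GiB.

```python
# Values taken from the smartctl output quoted in this thread.
power_on_hours = 17394       # attribute 9, hours part of the raw value
lifetime_writes_gib = 6894   # attribute 241, Lifetime_Writes_GiB
drive_capacity = 120         # nominal drive size; GB vs. GiB ignored here

# Assumption: rough guess for how uneven wear leveling inflates
# the per-cell write count ("more than double").
assumed_wear_factor = 2

days_on = power_on_hours / 24
drive_writes = lifetime_writes_gib / drive_capacity
writes_per_cell_guess = drive_writes * assumed_wear_factor

print(f"Powered on for about {days_on:.2f} days")
print(f"About {drive_writes:.2f} full drive writes")
print(f"Guessed writes per cell: at least {writes_per_cell_guess:.0f}")
```

Change assumed_wear_factor if you have a better idea of how your controller
behaves; the result is a ballpark figure only.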

For my personal use I would replace that drive ASAP:
- It is no longer under warranty (too long since purchase).
- You can't buy it new anymore (discontinued).
- There are more reliable drives available.

I'd go for a Samsung 850 EVO, which comes with a five-year warranty.

But it's your drive, so you make the decisions.

 - Yamaban.