Christian, can you post your values for Power_Loss_Cap_Test on the drive which is failing?

Thanks
Jan

> On 03 Aug 2016, at 13:33, Christian Balzer <chibi@xxxxxxx> wrote:
>
>
> Hello,
>
> yeah, I was particularly interested in the Power_Loss_Cap_Test bit, as it
> seemed to be such an odd thing to fail (given that's not a single capacitor).
>
> As for your Reallocated_Sector_Ct, that's really odd and definitely an
> RMA-worthy issue.
>
> For the record, Intel SSDs use (typically 24) sectors when doing firmware
> upgrades, so this is a totally healthy 3610. ^o^
> ---
>   5 Reallocated_Sector_Ct   0x0032   099   099   000    Old_age   Always       -       24
> ---
>
> Christian
>
> On Wed, 3 Aug 2016 13:12:53 +0200 Daniel Swarbrick wrote:
>
>> Right, I actually updated to smartmontools 6.5+svn4324, which now
>> properly supports this drive model. Some of the smart attr names have
>> changed, and make more sense now (and there are no more "Unknowns"):
>>
>> ID# ATTRIBUTE_NAME          FLAGS    VALUE WORST THRESH FAIL RAW_VALUE
>>   5 Reallocated_Sector_Ct   -O--CK   081   081   000    -    944
>>   9 Power_On_Hours          -O--CK   100   100   000    -    1067
>>  12 Power_Cycle_Count       -O--CK   100   100   000    -    7
>> 170 Available_Reservd_Space PO--CK   085   085   010    -    0
>> 171 Program_Fail_Count      -O--CK   100   100   000    -    0
>> 172 Erase_Fail_Count        -O--CK   100   100   000    -    68
>> 174 Unsafe_Shutdown_Count   -O--CK   100   100   000    -    6
>> 175 Power_Loss_Cap_Test     PO--CK   100   100   010    -    6510 (4 4307)
>> 183 SATA_Downshift_Count    -O--CK   100   100   000    -    0
>> 184 End-to-End_Error        PO--CK   100   100   090    -    0
>> 187 Reported_Uncorrect      -O--CK   100   100   000    -    0
>> 190 Temperature_Case        -O---K   070   065   000    -    30 (Min/Max 25/35)
>> 192 Unsafe_Shutdown_Count   -O--CK   100   100   000    -    6
>> 194 Temperature_Internal    -O---K   100   100   000    -    30
>> 197 Current_Pending_Sector  -O--C-   100   100   000    -    1100
>> 199 CRC_Error_Count         -OSRCK   100   100   000    -    0
>> 225 Host_Writes_32MiB       -O--CK   100   100   000    -    20135
>> 226 Workld_Media_Wear_Indic -O--CK   100   100   000    -    20
>> 227 Workld_Host_Reads_Perc  -O--CK   100   100   000    -    82
>> 228 Workload_Minutes        -O--CK   100   100   000    -    64012
>> 232 Available_Reservd_Space PO--CK   084   084   010    -    0
>> 233 Media_Wearout_Indicator -O--CK   100   100   000    -    0
>> 234 Thermal_Throttle        -O--CK   100   100   000    -    0/0
>> 241 Host_Writes_32MiB       -O--CK   100   100   000    -    20135
>> 242 Host_Reads_32MiB        -O--CK   100   100   000    -    92945
>> 243 NAND_Writes_32MiB       -O--CK   100   100   000    -    95289
>>
>> Reallocated_Sector_Ct is still increasing, but Available_Reservd_Space
>> seems to be holding steady.
>>
>> AFAIK, we've only had one other S3610 fail, and it seemed to be a sudden
>> death. The drive simply disappeared from the controller one day, and
>> could no longer be detected.
>>
>> On 03/08/16 12:15, Jan Schermer wrote:
>>> Make sure you are reading the right attribute and interpreting it right.
>>> update-smart-drivedb sometimes works wonders :)
>>>
>>> I wonder what the isdct tool would say the drive's life expectancy is
>>> with this workload? Are you really writing ~600TB/month??
>>>
>>> Jan
>>>
>>
>> _______________________________________________
>> ceph-users mailing list
>> ceph-users@xxxxxxxxxxxxxx
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>
>
> --
> Christian Balzer        Network/Systems Engineer
> chibi@xxxxxxx           Global OnLine Japan/Rakuten Communications
> http://www.gol.com/
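
For anyone wanting to sanity-check the write volume behind the "reading the right attribute and interpreting it right" remark, here is a rough sketch that converts the raw values Daniel posted into byte totals. It rests on assumptions that are not confirmed anywhere in this thread: that each raw count of Host_Writes_32MiB and NAND_Writes_32MiB stands for 32 MiB (taken from the attribute names only), and that both counters cover the full Power_On_Hours period. The helper names parse_raw_values and summarize are made up for illustration.

```python
import re

# Excerpt of the attribute table quoted in this thread (brief format).
SMART_OUTPUT = """\
  9 Power_On_Hours          -O--CK   100   100   000    -    1067
241 Host_Writes_32MiB       -O--CK   100   100   000    -    20135
242 Host_Reads_32MiB        -O--CK   100   100   000    -    92945
243 NAND_Writes_32MiB       -O--CK   100   100   000    -    95289
"""

# Assumption: one raw count equals 32 MiB, as the attribute names suggest.
UNIT_BYTES = 32 * 1024 ** 2
GIB = 1024 ** 3


def parse_raw_values(text):
    """Map attribute ID -> leading integer of the RAW_VALUE column."""
    values = {}
    for line in text.splitlines():
        fields = line.split()
        # Columns: ID, NAME, FLAGS, VALUE, WORST, THRESH, FAIL, RAW_VALUE...
        if len(fields) < 8 or not fields[0].isdigit():
            continue  # skip the header and anything that is not a data row
        match = re.match(r"\d+", fields[7])
        if match:
            values[int(fields[0])] = int(match.group())
    return values


def summarize(raw):
    """Derive totals from attribute IDs 9, 241 and 243 (hypothetical helper)."""
    host_bytes = raw[241] * UNIT_BYTES
    nand_bytes = raw[243] * UNIT_BYTES
    hours = raw[9]
    return {
        "host_gib": host_bytes / GIB,
        "nand_gib": nand_bytes / GIB,
        # Ratio of NAND writes to host writes over the same period.
        "write_amplification": nand_bytes / host_bytes,
        # Host writes scaled to a 30-day window of power-on time.
        "host_gib_per_30_days": host_bytes / GIB * (30 * 24) / hours,
    }


if __name__ == "__main__":
    for key, value in summarize(parse_raw_values(SMART_OUTPUT)).items():
        print(f"{key}: {value:.2f}")
```

Under those assumptions the posted counters work out to roughly 629 GiB of host writes and about 2978 GiB of NAND writes in total, i.e. a NAND-to-host ratio of about 4.7. If the unit or the counting period is different on this firmware, the figures change accordingly.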