Re: Power outages!!! help!

Ronny Aasen <ronny+ceph-users@xxxxxxxx> · Sun, 3 Sep 2017 13:55:49 +0200



    I would not even attempt to connect a
      recovered drive to ceph, especially not one that have had xfs
      errors and corruption.  

      
      your pg's that are undersized lead me to belive you still need to
      either expand, with more disks, or nodes. or that you need to set
      

      osd crush chooseleaf type = 0 

      to let ceph pick 2 disks on the same node as a valid object
      placement.  (temporary until you get 2 balanced nodes) generally
      let ceph selfheal as much as possible (no misplaced or degraded
      objects)  this require that ceph have space for the recovery. 

      i would run with size=2 min_size=2  

      
      you should also look at the 7 shrub errors. they indicate that
      there can be other drives with issues, you want to locate where
      those inconsistent objects are, and fix them. read this page about
      fixing scrub errors.
      http://ceph.com/geen-categorie/ceph-manually-repair-object/

      
      then you would sit with the 103 unfound objects, and those you
      should try to recover from the recovered drive. 

      by using the ceph-objectstore-tool
        export/import  to try and export pg's missing
      objects  to a dedicated temporary added import drive.

      the import drive does not need to be very large. since you can do
      one and one pg at the time. and you should only recover pg's that
      contain unfound objects. there is realy only 103 unfound objects
      that you need to recover. 

      once the recovery is compleate you can wipe the functioning
      recovery drive, and install it as a new osd to the cluster.

      
      kind regards

      Ronny Aasen

      
      On 03.09.2017 06:20, hjcho616 wrote:

    
        I checked with
            ceph-2, 3, 4, 5 so I figured it was safe to assume that
            superblock file is the same.  I copied it over and started
            OSD.  It still fails with the same error message.  Looks
            like when I updated to 10.2.9, some osd needs to be updated
            and that process is not finding the data it needs?  What can
            I do about this situation?
        

        2017-09-01
          22:27:35.590041 7f68837e5800  1
          filestore(/var/lib/ceph/osd/ceph-0) upgrade
        2017-09-01
          22:27:35.590149 7f68837e5800 -1
          filestore(/var/lib/ceph/osd/ceph-0) could not find
          #-1:7b3f43c4:::osd_superblock:0# in index: (2) No such file or
          directory
        

        Regards,
        Hong
        

               On Friday,
                  September 1, 2017 11:10 PM, hjcho616
                  <hjcho616@xxxxxxxxx> wrote:

                
                      Just realized there
                          is a file called superblock in the ceph
                          directory.  ceph-1 and ceph-2's superblock
                          file is identical, ceph-6 and ceph-7 are
                          identical, but not between the two groups.
                           When I originally created the OSDs, I created
                          ceph-0 through 5.  Can superblock file be
                          copied over from ceph-1 to ceph-0?
                      

                      Hmm..
                        it appears to be doing something in the
                        background even though osd.0 is down.  ceph
                        health output is changing!
                      #
                        ceph health
                      HEALTH_ERR
                        40 pgs are stuck inactive for more than 300
                        seconds; 14 pgs backfill_wait; 21 pgs degraded;
                        10 pgs down; 2 pgs inconsistent; 10 pgs peering;
                        3 pgs recovering; 2 pgs recovery_wait; 30 pgs
                        stale; 21 pgs stuck degraded; 10 pgs stuck
                        inactive; 30 pgs stuck stale; 45 pgs stuck
                        unclean; 16 pgs stuck undersized; 16 pgs
                        undersized; 2 requests are blocked > 32 sec;
                        recovery 221826/2473662 objects degraded
                        (8.968%); recovery 254711/2473662 objects
                        misplaced (10.297%); recovery 103/2251966
                        unfound (0.005%); 7 scrub errors; mds cluster is
                        degraded; no legacy OSD present but
                        'sortbitwise' flag is not set
                      

                      Regards,
                      Hong
                      

                                  On Friday, September 1, 2017 10:37 PM,
                                  hjcho616 <hjcho616@xxxxxxxxx>
                                  wrote:

                                
                                      Tried
                                        connecting recovered osd.  Looks
                                        like some of the files in the
                                        lost+found are super blocks.
                                         Below is the log.  What can I
                                        do about this?
                                      

                                      2017-09-01
                                        22:27:27.634228 7f68837e5800  0
                                        set uid:gid to 1001:1001
                                        (ceph:ceph)
                                      2017-09-01
                                        22:27:27.634245 7f68837e5800  0
                                        ceph version 10.2.9
                                        (2ee413f77150c0f375ff6f10edd6c8f9c7d060d0),
                                        process ceph-osd, pid 5432
                                      2017-09-01
                                        22:27:27.635456 7f68837e5800  0
                                        pidfile_write: ignore empty
                                        --pid-file
                                      2017-09-01
                                        22:27:27.646849 7f68837e5800  0
filestore(/var/lib/ceph/osd/ceph-0) backend xfs (magic 0x58465342)
                                      2017-09-01
                                        22:27:27.647077 7f68837e5800  0
genericfilestorebackend(/var/lib/ceph/osd/ceph-0) detect_features:
                                        FIEMAP ioctl is disabled via
                                        'filestore fiemap' config option
                                      2017-09-01
                                        22:27:27.647080 7f68837e5800  0
genericfilestorebackend(/var/lib/ceph/osd/ceph-0) detect_features:
                                        SEEK_DATA/SEEK_HOLE is disabled
                                        via 'filestore seek data hole'
                                        config option
                                      2017-09-01
                                        22:27:27.647091 7f68837e5800  0
genericfilestorebackend(/var/lib/ceph/osd/ceph-0) detect_features:
                                        splice is supported
                                      2017-09-01
                                        22:27:27.678937 7f68837e5800  0
genericfilestorebackend(/var/lib/ceph/osd/ceph-0) detect_features:
                                        syncfs(2) syscall fully
                                        supported (by glibc and kernel)
                                      2017-09-01
                                        22:27:27.679044 7f68837e5800  0
xfsfilestorebackend(/var/lib/ceph/osd/ceph-0) detect_feature: extsize is
                                        disabled by conf
                                      2017-09-01
                                        22:27:27.680718 7f68837e5800  1
                                        leveldb: Recovering log #28054
                                      2017-09-01
                                        22:27:27.804501 7f68837e5800  1
                                        leveldb: Delete type=0 #28054
                                      

                                      2017-09-01
                                        22:27:27.804579 7f68837e5800  1
                                        leveldb: Delete type=3 #28053
                                      

                                      2017-09-01
                                        22:27:35.586725 7f68837e5800  0
filestore(/var/lib/ceph/osd/ceph-0) mount: enabling WRITEAHEAD journal
                                        mode: checkpoint is not enabled
                                      2017-09-01
                                        22:27:35.587689 7f68837e5800  1
                                        journal _open
                                        /var/lib/ceph/osd/ceph-0/journal
                                        fd 18: 9998729216 bytes, block
                                        size 4096 bytes, directio = 1,
                                        aio = 1
                                      2017-09-01
                                        22:27:35.589631 7f68837e5800  1
                                        journal _open
                                        /var/lib/ceph/osd/ceph-0/journal
                                        fd 18: 9998729216 bytes, block
                                        size 4096 bytes, directio = 1,
                                        aio = 1
                                      2017-09-01
                                        22:27:35.590041 7f68837e5800  1
filestore(/var/lib/ceph/osd/ceph-0) upgrade
                                      2017-09-01
                                        22:27:35.590149 7f68837e5800 -1
filestore(/var/lib/ceph/osd/ceph-0) could not find
                                        #-1:7b3f43c4:::osd_superblock:0#
                                        in index: (2) No such file or
                                        directory
                                      2017-09-01
                                        22:27:35.590158 7f68837e5800 -1
                                        osd.0 0 OSD::init() : unable to
                                        read osd superblock
                                      2017-09-01
                                        22:27:35.590547 7f68837e5800  1
                                        journal close
                                        /var/lib/ceph/osd/ceph-0/journal
                                      2017-09-01
                                        22:27:35.611595 7f68837e5800 -1
                                        ^[[0;31m ** ERROR: osd init
                                        failed: (22) Invalid
                                        argument^[[0m
                                      

                                      Recovered
                                        drive is mounted on
                                        /var/lib/ceph/osd/ceph-0.
                                      #
                                        df
                                      Filesystem
                                             1K-blocks      Used
                                         Available Use% Mounted on
                                      udev
                                                       10240         0  
                                           10240   0% /dev
                                      tmpfs
                                                    1584780      9172  
                                         1575608   1% /run
                                      /dev/sda1
                                               15247760   9319048  
                                         5131120  65% /
                                      tmpfs
                                                    3961940         0  
                                         3961940   0% /dev/shm
                                      tmpfs
                                                       5120         0  
                                            5120   0% /run/lock
                                      tmpfs
                                                    3961940         0  
                                         3961940   0% /sys/fs/cgroup
                                      /dev/sdb1
                                             1952559676 634913968
                                        1317645708  33%
                                        /var/lib/ceph/osd/ceph-0
                                      /dev/sde1
                                             1952559676 640365952
                                        1312193724  33%
                                        /var/lib/ceph/osd/ceph-6
                                      /dev/sdd1
                                             1952559676 712018768
                                        1240540908  37%
                                        /var/lib/ceph/osd/ceph-2
                                      /dev/sdc1
                                             1952559676 755827440
                                        1196732236  39%
                                        /var/lib/ceph/osd/ceph-1
                                      /dev/sdf1
                                              312417560  42538060
                                         269879500  14%
                                        /var/lib/ceph/osd/ceph-7
                                      tmpfs
                                                     792392         0  
                                          792392   0% /run/user/0
                                      #
                                        cd /var/lib/ceph/osd/ceph-0
                                      #
                                        ls
                                      activate.monmap
                                         current  journal_uuid  magic  
                                               superblock  whoami
                                      active
                                                  fsid     keyring      
                                        ready          sysvinit
                                      ceph_fsid
                                               journal  lost+found  
                                         store_version  type
                                      

                                      Regards,
                                      Hong
                                      

                                                  On Friday, September
                                                  1, 2017 2:59 PM,
                                                  hjcho616
                                                  <hjcho616@xxxxxxxxx>
                                                  wrote:

                                                
                                                      Found the
                                                          partition,
                                                          wasn't able to
                                                          mount the
                                                          partition
                                                          right away...
                                                          Did a
                                                          xfs_repair on
                                                          that drive.  
                                                      

                                                      Got bunch of
                                                          messages like
                                                          this.. =(
                                                      entry
"100000a89fd.00000000__head_AE319A25__0" in shortform directory
                                                        845908970
                                                        references
                                                        non-existent
                                                        inode 605294241
                                                                      
                                                      junking
                                                        entry
                                                        "100000a89fd.00000000__head_AE319A25__0"
                                                        in directory
                                                        inode 845908970
                                                                  
                                                      
                                                      Was
                                                        able to mount.
                                                         lost+found has
                                                        lots of files
                                                        there. =P
                                                         Running du
                                                        seems to show OK
                                                        files in current
                                                        directory.
                                                      

                                                      Will
                                                        it be safe to
                                                        attach this one
                                                        back to the
                                                        cluster?  Is
                                                        there a way to
                                                        specify to use
                                                        this drive if
                                                        the data is
                                                        missing? =)  Or
                                                        am I being
                                                        paranoid?  Just
                                                        plug it? =)
                                                      

                                                      Regards,
                                                      Hong
                                                      

                                                           On
                                                          Friday,
                                                          September 1,
                                                          2017 9:01 AM,
                                                          hjcho616
                                                          <hjcho616@xxxxxxxxx>
                                                          wrote:

                                                          
                                                          Looks
                                                          like it has
                                                          been
                                                          rescued...
                                                          Only 1 error
                                                          as we saw
                                                          before in the
                                                          smart log!
                                                          #
                                                          ddrescue -f
                                                          /dev/sda
                                                          /dev/sdc
                                                          ./rescue.log
                                                          GNU
                                                          ddrescue 1.21
                                                          Press
                                                          Ctrl-C to
                                                          interrupt
                                                           
                                                             ipos:  
                                                           1508 GB,
                                                          non-trimmed:  
                                                               0 B,
                                                           current rate:
                                                                0 B/s
                                                           
                                                             opos:  
                                                           1508 GB,
                                                          non-scraped:  
                                                               0 B,
                                                           average rate:
                                                           88985 kB/s
                                                          non-tried:
                                                                 0 B,  
                                                            errsize:    
                                                          4096 B,    
                                                           run time:  6h
                                                          14m 40s
                                                           
                                                          rescued:  
                                                           2000 GB,    
                                                           errors:      
                                                           1,  remaining
                                                          time:        
                                                          n/a
                                                          percent
                                                          rescued:
                                                           99.99%    
                                                           time since
                                                          last
                                                          successful
                                                          read:        
                                                          39s
                                                          Finished              
                                                                  
                                                          
                                                          Still missing
                                                          partition in
                                                          the new drive.
                                                          =P  I found
                                                          this util
                                                          called
                                                          testdisk for
                                                          broken
                                                          partition
                                                          tables.  Will
                                                          try that
                                                          tonight. =P
                                                          

                                                          Regards,
                                                          Hong
                                                          

                                                           On
                                                          Wednesday,
                                                          August 30,
                                                          2017 9:18 AM,
                                                          Ronny Aasen
                                                          <ronny+ceph-users@xxxxxxxx>
                                                          wrote:

                                                          
                                                          On
                                                          30.08.2017
                                                          15:32, Steve
                                                          Taylor wrote:

                                                          
                                                          I'm not
                                                          familiar with
                                                          dd_rescue, but
                                                          I've just been
                                                          reading about
                                                          it. I'm not
                                                          seeing any
                                                          features that
                                                          would be
                                                          beneficial in
                                                          this scenario
                                                          that aren't
                                                          also available
                                                          in dd. What
                                                          specific
                                                          features give
                                                          it "really a
                                                          far better
                                                          chance
                                                          of restoring a copy of your disk"
                                                          than dd? I'm
                                                          always
                                                          interested in
                                                          learning about
                                                          new recovery
                                                          tools.
                                                          

                                                          i see i wrote
                                                          dd_rescue from
                                                          old habit, but
                                                          the package
                                                          one should use
                                                          on debian is
                                                          gddrescue or
                                                          also called
                                                          gnu ddrecue. 

                                                          
                                                          this page have
                                                          some details
                                                          on the
                                                          differences on
                                                          dd vs the
                                                          ddrescue
                                                          variants. 

                                                          http://www.toad.com/gnu/sysadmin/index.html#ddrescue

                                                          
                                                          kind regards

                                                          Ronny Aasen

                                                          
                                                          Steve Taylor | Senior Software Engineer
                                                          | StorageCraft
                                                          Technology
                                                          Corporation

                                                          380 Data Drive Suite 300 | Draper | Utah | 84020

                                                          Office:
                                                          801.871.2799 |
                                                          
                                                          
                                                          If you are not the intended recipient of this message or
                                                          received it
                                                          erroneously,
                                                          please notify
                                                          the sender and
                                                          delete it,
                                                          together with
                                                          any
                                                          attachments,
                                                          and be advised
                                                          that any
                                                          dissemination
                                                          or copying of
                                                          this message
                                                          is prohibited.
                                                          
                                                          
                                                          On Tue,
                                                          2017-08-29 at
                                                          21:49 +0200,
                                                          Willem Jan
                                                          Withagen
                                                          wrote:
                                                          
                                                          On 29-8-2017 19:12, Steve Taylor wrote:

                                                          
                                                          Hong,
                                                          Probably your
                                                          best chance at
                                                          recovering any
                                                          data without
                                                          special,
                                                          expensive,
                                                          forensic
                                                          procedures is
                                                          to perform a
                                                          dd from
                                                          /dev/sdb to
                                                          somewhere else
                                                          large enough
                                                          to hold a full
                                                          disk image and
                                                          attempt to
                                                          repair that.
                                                          You'll want to
                                                          use
                                                          'conv=noerror'
                                                          with your dd
                                                          command
                                                          since your
                                                          disk is
                                                          failing. Then
                                                          you could
                                                          either
                                                          re-attach the
                                                          OSD
                                                          from the new
                                                          source or
                                                          attempt to
                                                          retrieve
                                                          objects from
                                                          the filestore
                                                          on it.
                                                          
                                                          Like somebody
                                                          else already
                                                          pointed out
                                                          In problem
                                                          "cases like
                                                          disk, use
                                                          dd_rescue.
                                                          It has really
                                                          a far better
                                                          chance of
                                                          restoring a
                                                          copy of your
                                                          disk
                                                          --WjW
                                                          
                                                          I have
                                                          actually done
                                                          this before by
                                                          creating an
                                                          RBD that
                                                          matches the
                                                          disk size,
                                                          performing the
                                                          dd, running
                                                          xfs_repair,
                                                          and eventually
                                                          adding it back
                                                          to the cluster
                                                          as an OSD.
                                                          RBDs as OSDs
                                                          is certainly a
                                                          temporary
                                                          arrangement
                                                          for repair
                                                          only, but I'm
                                                          happy to
                                                          report that it
                                                          worked
                                                          flawlessly in
                                                          my case. I was
                                                          able to weight
                                                          the OSD to 0,
                                                          offload all of
                                                          its data, then
                                                          remove it for
                                                          a full
                                                          recovery, at
                                                          which
                                                          point I just
                                                          deleted the
                                                          RBD.
                                                          The
                                                          possibilities
                                                          afforded by
                                                          Ceph inception
                                                          are endless. ☺
                                                          Steve Taylor |
                                                          Senior
                                                          Software
                                                          Engineer |
                                                          StorageCraft
                                                          Technology
                                                          Corporation
                                                          380 Data Drive
                                                          Suite 300
                                                          | Draper |
                                                          Utah | 84020
                                                          Office:
                                                          801.871.2799 |
                                                          If you are not
                                                          the intended
                                                          recipient of
                                                          this message
                                                          or received it
                                                          erroneously,
                                                          please notify
                                                          the sender and
                                                          delete it,
                                                          together with
                                                          any
                                                          attachments,
                                                          and be advised
                                                          that any
                                                          dissemination
                                                          or copying of
                                                          this message
                                                          is prohibited.
                                                          On Mon,
                                                          2017-08-28 at
                                                          23:17 +0100,
                                                          Tomasz
                                                          Kusmierz
                                                          wrote:
                                                          
                                                          Rule of thumb
                                                          with batteries
                                                          is:
                                                          - more “proper
                                                          temperature”
                                                          you run them
                                                          at the more
                                                          life you get
                                                          out
                                                          of them
                                                          - more battery
                                                          is overpowered
                                                          for your
                                                          application
                                                          the longer it
                                                          will
                                                          survive. 
                                                          Get your self
                                                          a LSI 94**
                                                          controller and
                                                          use it as HBA
                                                          and you will
                                                          be
                                                          fine. but get
                                                          MORE DRIVES
                                                          !!!!! … 
                                                          
                                                          On 28 Aug
                                                          2017, at
                                                          23:10,
                                                          hjcho616 <hjcho616@xxxxxxxxx>
                                                          wrote:
                                                          Thank you
                                                          Tomasz and
                                                          Ronny.  I'll
                                                          have to order
                                                          some hdd soon
                                                          and
                                                          try these out.
                                                           Car battery
                                                          idea is nice!
                                                           I may try
                                                          that.. =)  Do
                                                          they last
                                                          longer?  Ones
                                                          that fit the
                                                          UPS original
                                                          battery spec
                                                          didn't last
                                                          very long...
                                                          part of the
                                                          reason why I
                                                          gave up on
                                                          them..
                                                          =P  My wife
                                                          probably won't
                                                          like the idea
                                                          of car battery
                                                          hanging out
                                                          though ha!
                                                          The OSD1 (one
                                                          with mostly ok
                                                          OSDs, except
                                                          that smart
                                                          failure)
                                                          motherboard
                                                          doesn't have
                                                          any additional
                                                          SATA
                                                          connectors
                                                          available.
                                                           Would it be
                                                          safe to add
                                                          another OSD
                                                          host?
                                                          Regards,
                                                          Hong
                                                          On Monday,
                                                          August 28,
                                                          2017 4:43 PM,
                                                          Tomasz
                                                          Kusmierz <tom.kusmierz@g
                                                          mail.com>
                                                          wrote:
                                                          Sorry for
                                                          being brutal …
                                                          anyway 
                                                          1. get the
                                                          battery for
                                                          UPS ( a car
                                                          battery will
                                                          do as well,
                                                          I’ve
                                                          moded on ups
                                                          in the past
                                                          with truck
                                                          battery and it
                                                          was working
                                                          like
                                                          a charm :D )
                                                          2. get spare
                                                          drives and put
                                                          those in
                                                          because your
                                                          cluster CAN
                                                          NOT
                                                          get out of
                                                          error due to
                                                          lack of space
                                                          3. Follow
                                                          advice of
                                                          Ronny Aasen on
                                                          hot to recover
                                                          data from hard
                                                          drives 
                                                          4 get cooling
                                                          to drives or
                                                          you will loose
                                                          more ! 
                                                          
                                                          On 28 Aug
                                                          2017, at
                                                          22:39,
                                                          hjcho616 <hjcho616@xxxxxxxxx>
                                                          wrote:
                                                          Tomasz,
                                                          Those machines
                                                          are behind a
                                                          surge
                                                          protector.
                                                           Doesn't
                                                          appear to
                                                          be a good one!
                                                           I do have a
                                                          UPS... but it
                                                          is my fault...
                                                          no
                                                          battery.
                                                           Power was
                                                          pretty
                                                          reliable for a
                                                          while... and
                                                          UPS was
                                                          just beeping
                                                          every chance
                                                          it had,
                                                          disrupting
                                                          some sleep..
                                                          =P  So
                                                          running on
                                                          surge
                                                          protector
                                                          only.  I am
                                                          running this
                                                          in home
                                                          environment.  
                                                          So far, HDD
                                                          failures have
                                                          been very rare
                                                          for this
                                                          environment.
                                                          =)  It just
                                                          doesn't get
                                                          loaded as
                                                          much!  I am
                                                          not
                                                          sure what to
                                                          expect, seeing
                                                          that "unfound"
                                                          and just a
                                                          feeling of
                                                          possibility of
                                                          maybe getting
                                                          OSD back made
                                                          me excited
                                                          about it.
                                                          =) Thanks for
                                                          letting me
                                                          know what
                                                          should be the
                                                          priority.  I
                                                          just lack
                                                          experience and
                                                          knowledge in
                                                          this. =)
                                                          Please do
                                                          continue
                                                          to guide me
                                                          though this. 
                                                          Thank you for
                                                          the decode of
                                                          that smart
                                                          messages!  I
                                                          do agree that
                                                          looks like it
                                                          is on its way
                                                          out.  I would
                                                          like to know
                                                          how to get
                                                          good portion
                                                          of it back if
                                                          possible. =)
                                                          I think I just
                                                          set the size
                                                          and min_size
                                                          to 1.
                                                          # ceph osd
                                                          lspools
                                                          0 data,1
                                                          metadata,2
                                                          rbd,
                                                          # ceph osd
                                                          pool set rbd
                                                          size 1
                                                          set pool 2
                                                          size to 1
                                                          # ceph osd
                                                          pool set rbd
                                                          min_size 1
                                                          set pool 2
                                                          min_size to 1
                                                          Seems to be
                                                          doing some
                                                          backfilling
                                                          work.
                                                          # ceph health
                                                          HEALTH_ERR 22
                                                          pgs are stuck
                                                          inactive for
                                                          more than 300
                                                          seconds; 2
                                                          pgs
                                                          backfill_toofull;
                                                          74 pgs
                                                          backfill_wait;
                                                          3 pgs
                                                          backfilling;
                                                          108 pgs
                                                          degraded; 6
                                                          pgs down; 6
                                                          pgs
                                                          inconsistent;
                                                          6 pgs peering;
                                                          7 pgs
                                                          recovery_wait;
                                                          16 pgs stale;
                                                          108 pgs stuck
                                                          degraded; 6
                                                          pgs
                                                          stuck
                                                          inactive; 16
                                                          pgs stuck
                                                          stale; 130 pgs
                                                          stuck unclean;
                                                          101
                                                          pgs stuck
                                                          undersized;
                                                          101 pgs
                                                          undersized; 1
                                                          requests are
                                                          blocked
                                                          
                                                          32 sec;
                                                          recovery
                                                          1790657/4502340
                                                          objects
                                                          degraded
                                                          (39.772%);
                                                          
                                                          recovery
                                                          641906/4502340
                                                          objects
                                                          misplaced
                                                          (14.257%);
                                                          recovery
                                                          147/2251990
                                                          unfound
                                                          (0.007%); 50
                                                          scrub errors;
                                                          mds cluster is
                                                          degraded; no
                                                          legacy OSD
                                                          present but
                                                          'sortbitwise'
                                                          flag is not
                                                          set
                                                          Regards,
                                                          Hong
                                                          On Monday,
                                                          August 28,
                                                          2017 4:18 PM,
                                                          Tomasz
                                                          Kusmierz
                                                          <tom.kusmierz
                                                          @gmail.com>
                                                          wrote:
                                                          So to decode
                                                          few things
                                                          about your
                                                          disk:
                                                            1
                                                          Raw_Read_Error_Rate 
                                                            0x002f  100 
                                                          100  051   
                                                          Pre-fail 
                                                          Always      - 
                                                              37
                                                          37 read erros
                                                          and only one
                                                          sector marked
                                                          as pending -
                                                          fun disk
                                                          :/ 
                                                          181
                                                          Program_Fail_Cnt_Total 
                                                          0x0022  099 
                                                          099  000   
                                                          Old_age 
                                                          Always      - 
                                                              35325174
                                                          So firmware
                                                          has quite few
                                                          bugs, that’s
                                                          nice
                                                          191
                                                          G-Sense_Error_Rate 
                                                              0x0022 
                                                          100  100  000 
                                                            Old_age 
                                                          Always      - 
                                                              2855
                                                          disk was
                                                          thrown around
                                                          while
                                                          operational
                                                          even more
                                                          nice.
                                                          194
                                                          Temperature_Celsius 
                                                            0x0002  047 
                                                          041  000   
                                                          Old_age 
                                                          Always      - 
                                                              53
                                                          (Min/Max
                                                          15/59)
                                                          if your disk
                                                          passes 50 you
                                                          should not
                                                          consider using
                                                          it, high
                                                          temperatures
                                                          demagnetise
                                                          plate layer
                                                          and you will
                                                          see more
                                                          errors
                                                          in very near
                                                          future.
                                                          197
                                                          Current_Pending_Sector 
                                                          0x0032  100 
                                                          100  000   
                                                          Old_age 
                                                          Always      - 
                                                              1
                                                          as mentioned
                                                          before :)
                                                          200
                                                          Multi_Zone_Error_Rate 
                                                          0x002a  100 
                                                          100  000   
                                                          Old_age 
                                                          Always      - 
                                                              4222
                                                          your heads
                                                          keep missing
                                                          tracks … bent
                                                          ? I don’t even
                                                          know how to
                                                          comment here.
                                                          generally fun
                                                          drive you’ve
                                                          got there …
                                                          rescue as much
                                                          as you can
                                                          and throw it
                                                          away !!!
                                                          
                                                          
_______________________________________________
                                                          ceph-users
                                                          mailing list
                                                          ceph-users@xxxxxxxxxxxxxx
                                                          http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
                                                          
                                                          
                                                          _______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

                                                          
                                                          _______________________________________________

                                                          ceph-users
                                                          mailing list

                                                          ceph-users@xxxxxxxxxxxxxx

                                                          http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

                                                          
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com