Re: Replacing a disk: Best practices?


Hi,

>         I recently had an OSD disk die, and I'm wondering what are the
> current "best practices" for replacing it.  I think I've thoroughly removed
> the old disk, both physically and logically, but I'm having trouble figuring
> out how to add the new disk into ceph.

I did this today (one disk - osd.16 - died ;-):

       # @ceph-node3
        /etc/init.d/ceph stop osd.16

        # remove osd.16 from the cluster
        ceph osd crush remove osd.16
        ceph auth del osd.16
        ceph osd rm osd.16
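
If you have to do this more than once, the three removal commands above can be wrapped in a small helper. This is only a sketch: `remove_osd`, `run` and `DRY_RUN` are names made up here, not part of ceph. With `DRY_RUN=1` (the default in this sketch) the commands are printed instead of executed, so the sequence can be reviewed first.

```shell
# Hypothetical helper, not part of ceph. DRY_RUN=1 prints the commands
# instead of running them; set DRY_RUN=0 to actually execute them.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# remove_osd ID: same three steps as above, stopping at the first failure.
remove_osd() {
    id="$1"
    case "$id" in
        ''|*[!0-9]*)
            echo "usage: remove_osd <numeric-osd-id>" >&2
            return 1
            ;;
    esac
    run ceph osd crush remove "osd.$id" &&
    run ceph auth del "osd.$id" &&
    run ceph osd rm "osd.$id"
}
```

Calling `remove_osd 16` in dry-run mode prints the three commands for osd.16 in the order used above.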

        # pull the dead HDD, plug in the new HDD
        # /var/log/messages tells me
                Oct 15 09:51:09 ceph-node3 kernel: [1489736.671840] sd 0:0:0:0: [sdd] Synchronizing SCSI cache
                Oct 15 09:51:09 ceph-node3 kernel: [1489736.671873] sd 0:0:0:0: [sdd]  Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK
                Oct 15 09:54:56 ceph-node3 kernel: [1489963.094744] sd 0:0:8:0: Attached scsi generic sg4 type 0
                Oct 15 09:54:56 ceph-node3 kernel: [1489963.095235] sd 0:0:8:0: [sdd] 7814037168 512-byte logical blocks: (4.00 TB/3.63 TiB)
                Oct 15 09:54:57 ceph-node3 kernel: [1489963.343664] sd 0:0:8:0: [sdd] Attached SCSI disk
        --> /dev/sdd


        # check /dev/sdd
        root@ceph-node3:~#  smartctl -a /dev/sdd | less
                === START OF INFORMATION SECTION ===
                Device Model:     ST4000NM0033-9ZM170
                Serial Number:    Z1Z5LGBX
                LU WWN Device Id: 5 000c50 079577e1a
                Firmware Version: SN04
                User Capacity:    4.000.787.030.016 bytes [4,00 TB]
                ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
                  4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always       -       1
                  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
        --> ok
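
The "ok" above is a by-eye check of the attribute table. A minimal sketch of doing the same check in a script, assuming the table layout shown above (raw value in the 10th column); `realloc_count` is a made-up name, and it only reads text on stdin, so it can be fed from `smartctl -A /dev/sdd`:

```shell
# Hypothetical helper: print the raw value of SMART attribute 5
# (Reallocated_Sector_Ct) from smartctl attribute-table text on stdin.
# Exits non-zero if the attribute line is not present.
realloc_count() {
    awk '$1 == 5 && $2 == "Reallocated_Sector_Ct" { print $10; found = 1 }
         END { exit !found }'
}
```

Usage would be something like `smartctl -A /dev/sdd | realloc_count`; a brand-new disk should report 0.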

        # the new /dev/sdd has this persistent by-id path:
        /dev/disk/by-id/scsi-SATA_ST4000NM0033-9Z_Z1Z5LGBX
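
One way to look up that by-id name for a given device node is to scan the symlinks and compare what they resolve to. A sketch, assuming `readlink -f` is available (GNU coreutils); `by_id_for` is a made-up name, and the optional second argument is only there so a different directory can be scanned:

```shell
# Hypothetical helper: by_id_for DEVICE [DIR]
# Print every symlink in DIR (default /dev/disk/by-id) that resolves
# to the same file as DEVICE.
by_id_for() {
    target="$(readlink -f "$1")" || return 1
    dir="${2:-/dev/disk/by-id}"
    for link in "$dir"/*; do
        [ -L "$link" ] || continue
        if [ "$(readlink -f "$link")" = "$target" ]; then
            echo "$link"
        fi
    done
}
```

`by_id_for /dev/sdd` may print more than one name for the same disk (by-id usually carries several aliases per device), so pick the one you want to keep in ceph.conf.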

        # create new  OSD  (with old journal partition)
        admin@ceph-admin:~/cluster1$ ceph-deploy osd create ceph-node3:sdd:/dev/disk/by-id/scsi-SATA_INTEL_SSDSC2BA1BTTV330609AU100FGN-part1
                [ceph_deploy.conf][DEBUG ] found configuration file at: /home/admin/.cephdeploy.conf
                [ceph_deploy.cli][INFO  ] Invoked (1.5.17): /usr/bin/ceph-deploy osd create ceph-node3:sdd:/dev/disk/by-id/scsi-SATA_INTEL_SSDSC2BA1BTTV330609AU100FGN-part1
                [ceph_deploy.osd][DEBUG ] Preparing cluster ceph disks ceph-node3:/dev/sdd:/dev/disk/by-id/scsi-SATA_INTEL_SSDSC2BA1BTTV330609AU100FGN-part1
                ...
                [ceph_deploy.osd][DEBUG ] Host ceph-node3 is now ready for osd use.

        # @ceph-admin modify config
        admin@ceph-admin:~/cluster1$ ceph osd tree
                ...
        admin@ceph-admin:~/cluster1$ emacs -nw ceph.conf
                # osd16 was replaced

                [osd.16]
                        ...
                        devs         = /dev/disk/by-id/scsi-SATA_ST4000NM0033-9Z_Z1Z5LGBX-part1
                        ...

        # deploy config
        ceph-deploy  --overwrite-conf config push ceph-mon{1,2,3} ceph-node{1,2,3} ceph-admin

        # re-enable cluster sync (assumes noout was set before the disk swap)
        ceph osd unset noout

        # check
        ceph -w
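
Instead of watching `ceph -w` by hand, the final check can also be polled. A sketch only: `wait_healthy` and `HEALTH_CMD` are made-up names; the default command is `ceph health`, whose output begins with HEALTH_OK once recovery is done. The try count and delay are arbitrary defaults, not recommendations.

```shell
# Hypothetical helper: wait_healthy [TRIES] [DELAY_SECONDS]
# Polls $HEALTH_CMD until its output begins with HEALTH_OK.
HEALTH_CMD="${HEALTH_CMD:-ceph health}"

wait_healthy() {
    tries="${1:-60}"
    delay="${2:-10}"
    status=""
    i=0
    while [ "$i" -lt "$tries" ]; do
        # HEALTH_CMD is deliberately unquoted so "ceph health" splits into words
        status="$($HEALTH_CMD 2>/dev/null)"
        case "$status" in
            HEALTH_OK*)
                echo "$status"
                return 0
                ;;
        esac
        i=$((i + 1))
        sleep "$delay"
    done
    echo "not HEALTH_OK after $tries checks: $status" >&2
    return 1
}
```

Backfilling a 4 TB disk can take hours, so with the defaults here the function will most likely give up first; raise TRIES or DELAY_SECONDS to suit.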

regards
Danny

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
