Hi Konstantin,

thanks for your answer, see my answer to Alfredo which includes your suggestions.

~Dietmar

On 01/11/2018 12:57 PM, Konstantin Shalygin wrote:
>> Now I wonder what is the correct way to replace a failed OSD block disk?
>
> The generic way for maintenance (e.g. a disk replacement) is to rebalance by changing the osd weight:
>
> ceph osd crush reweight osd_id 0
>
> The cluster then migrates the data off this osd. When the cluster is HEALTH_OK again you can safely remove the OSD:
>
> ceph osd out osd_id
> systemctl stop ceph-osd@osd_id
> ceph osd crush remove osd_id
> ceph auth del osd_id
> ceph osd rm osd_id
>
>> I'm not sure if there is something to do with the still existing bluefs db and wal partitions on the nvme device for the failed OSD. Do they have to be zapped? If yes, what is the best way?
>
> 1. Find the nvme partitions for this OSD. You can do this in several ways: with ceph-volume, by hand, or with "ceph-disk list" (which is more human readable):
>
> /dev/sda :
>  /dev/sda1 ceph data, active, cluster ceph, osd.0, block /dev/sda2, block.db /dev/nvme2n1p1, block.wal /dev/nvme2n1p2
>  /dev/sda2 ceph block, for /dev/sda1
>
> 2. Delete the partitions via parted or fdisk:
>
> fdisk -u /dev/nvme2n1
> d (delete partition)
> enter partition number of block.db: 1
> d
> enter partition number of block.wal: 2
> w (write partition table)
>
> 3. Deploy your new OSD.

--
_________________________________________
D i e t m a r  R i e d e r, Mag.Dr.
Innsbruck Medical University
Biocenter - Division for Bioinformatics
Email: dietmar.rieder@xxxxxxxxxxx
Web:   http://www.icbi.at
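For reference, a consolidated, non-interactive sketch of the procedure Konstantin describes. This is only a sketch: the osd id (osd.0), the device names (/dev/sda, /dev/nvme2n1), the partition numbers and the db/wal sizes are taken from his example or assumed purely for illustration, and the last step assumes ceph-volume lvm is available (Luminous or later) and that /dev/sda is the new, empty replacement disk. Adjust everything to your own cluster before running any of it.

  # Drain the failed OSD and wait for recovery to finish (HEALTH_OK).
  ceph osd crush reweight osd.0 0
  ceph -s

  # Remove the OSD from the cluster.
  ceph osd out osd.0
  systemctl stop ceph-osd@0
  ceph osd crush remove osd.0
  ceph auth del osd.0
  ceph osd rm osd.0

  # Non-interactive equivalent of the fdisk steps above: drop the old
  # block.db (partition 1) and block.wal (partition 2) on the NVMe device.
  sgdisk --delete=1 /dev/nvme2n1
  sgdisk --delete=2 /dev/nvme2n1
  partprobe /dev/nvme2n1

  # Recreate db/wal partitions (the sizes here are illustrative only).
  sgdisk --new=1:0:+30G /dev/nvme2n1
  sgdisk --new=2:0:+2G  /dev/nvme2n1
  partprobe /dev/nvme2n1

  # Deploy the replacement OSD, e.g. with ceph-volume.
  ceph-volume lvm create --data /dev/sda \
      --block.db /dev/nvme2n1p1 --block.wal /dev/nvme2n1p2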