Re: Question: replacing all OSDs of one node in 3-node cluster

Hi Daniel,

we had the same problem with a SATA DOM on our Ceph nodes. After write errors
the root partition was remounted read-only and the monitor died because it
could no longer write its logs. The OSDs, however, kept running.

For minimal downtime of the node I backed up the system disk over ssh to my
local machine while the node was still running:

ssh al21 "dd if=/dev/sdf | gzip -1 -" | dd of=/mnt/disk1/al21.gz

Then I unzipped the disk image and made it available via losetup to check the
file system. If the file system check does not succeed, you are out of luck
and can forget about the rest here.
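
Roughly, the check can be done like this (just a sketch; the loop device name,
partition numbering and fsck options depend on your image and file system):

gzip -dc /mnt/disk1/al21.gz > /mnt/disk1/al21   # unpack the image
losetup -fP /mnt/disk1/al21                     # attach it with partition scanning
losetup -a                                      # note the loop device, e.g. /dev/loop0
fsck -f /dev/loop0p1                            # check the root partition (use -n for a dry run)
losetup -d /dev/loop0                           # detach again when done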

After that I made a copy to a new system disk attached to my local PC via dd

dd if=/mnt/disk1/al21 of=/dev/sdd bs=4096

Before shutting down the cluster node with the damaged system disk, I set
the noout flag in the cluster to avoid unnecessary recovery traffic.
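
That is a single command, run from any node with an admin keyring:

ceph osd set noout    # keep the node's OSDs "in" while it is down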

After swapping in the new system disk the node booted fine and the OSDs
resynced their data.
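
Once recovery has caught up, remember to clear the flag again and watch the
cluster return to HEALTH_OK, for example:

ceph osd unset noout   # re-enable automatic marking-out
ceph -w                # watch recovery progress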

This should work independently of the Linux distribution you use.

Regards

Steffen


>>> <Daniel.Balsiger@xxxxxxxxxxxx> wrote on Wednesday, 10 February 2016 at 17:46:
> Hi Ceph users
> 
> This is my first post on this mailing list. Hope it's the correct one. 
> Please redirect me to the right place in case it is not.
> I am running a small Ceph cluster (3 nodes, with 3 OSDs and 1 monitor on each
> of them).
> Guess what, it is used as Cinder/Glance/Nova RBD storage for OpenStack.
> 
> I already replaced some single OSD (faulty disk) without any problems.
> Now I am facing another problem since the system disk on one of the 3 nodes 
> failed.
> So I thought I would take the 3 OSDs of this node out of the cluster, set up
> the node from scratch, and add the 3 OSDs again.
> 
> I did successfully take out the first 2 OSDs.
> Yes, I hit the corner case: I did it with "ceph osd crush reweight osd.<OSD#> 0.0",
> waited for active+clean, and followed with "ceph osd out <OSD#>".
> Status is now:
> 
> cluster d1af2097-8535-42f2-ba8c-0667f90cab61
>      health HEALTH_WARN
>             too many PGs per OSD (329 > max 300)
>             1 mons down, quorum 0,2 ceph0,ceph2
>      monmap e1: 3 mons at 
> {ceph0=10.0.0.30:6789/0,ceph1=10.0.0.31:6789/0,ceph2=10.0.0.32:6789/0}
>             election epoch 482, quorum 0,2 ceph0,ceph2
>      osdmap e1628: 9 osds: 9 up, 7 in
>       pgmap v2187375: 768 pgs, 3 pools, 38075 MB data, 9129 objects
>             119 GB used, 6387 GB / 6506 GB avail
>                  768 active+clean
> 
> HEALTH_WARN is because of 1 monitor being down (the broken node) and too many
> PGs per OSD (329 > max 300), since I am removing OSDs.
> 
> Now the problem I am facing: when I try to reweight the 3rd OSD to 0, the
> cluster never reaches the active+clean state anymore.
> # ceph --version 
> ceph version 0.94.5 (9764da52395923e0b32908d83a9f7304401fee43)
> # ceph osd crush reweight osd.7 0.0  
> reweighted item id 7 name 'osd.7' to 0 in crush map
> # ceph -s 
> cluster d1af2097-8535-42f2-ba8c-0667f90cab61
>      health HEALTH_WARN
>             768 pgs stuck unclean
>             recovery 817/27387 objects degraded (2.983%)
>             recovery 9129/27387 objects misplaced (33.333%)
>             1 mons down, quorum 0,2 ceph0,ceph2
>      monmap e1: 3 mons at 
> {ceph0=10.0.0.30:6789/0,ceph1=10.0.0.31:6789/0,ceph2=10.0.0.32:6789/0}
>             election epoch 482, quorum 0,2 ceph0,ceph2
>      osdmap e1682: 9 osds: 9 up, 7 in; 768 remapped pgs
>       pgmap v2187702: 768 pgs, 3 pools, 38076 MB data, 9129 objects
>             119 GB used, 6387 GB / 6506 GB avail
>             817/27387 objects degraded (2.983%)
>             9129/27387 objects misplaced (33.333%)
>                  768 active+remapped
> 
> I also noticed that I need to reweight it back to 0.7 to get into the
> active+clean state again.
>   
> Any idea how to remove this last OSD so that I can set up the node again?
> Thank you in advance, any help appreciated.
> 
> Daniel

-- 
Klinik-Service Neubrandenburg GmbH
Allendestr. 30, 17036 Neubrandenburg
Registered at Amtsgericht Neubrandenburg, HRB 2457
Managing Director: Gudrun Kappich
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


