Hi,

I had a Ceph cluster in "HEALTH_OK" state with Firefly 0.80.9. I just wanted to remove an OSD (which was working well). So, after:

    ceph osd out 3

I waited for the rebalancing, but I ended up with "PGs stuck unclean":

---------------------------------------------------------------
~# ceph -s
    cluster e865b3d0-535a-4f18-9883-2793079d400b
     health HEALTH_WARN 180 pgs stuck unclean; recovery -3/29968 objects degraded (-0.010%)
     monmap e3: 3 mons at {0=10.0.2.150:6789/0,1=10.0.2.151:6789/0,2=10.0.2.152:6789/0}, election epoch 224, quorum 0,1,2 0,1,2
     mdsmap e109: 1/1/1 up {0=1=up:active}, 1 up:standby
     osdmap e979: 21 osds: 21 up, 20 in
      pgmap v465311: 8704 pgs, 14 pools, 61438 MB data, 14984 objects
            160 GB used, 4886 GB / 5046 GB avail
            -3/29968 objects degraded (-0.010%)
                 180 active+remapped
                8524 active+clean
  client io 6862 B/s wr, 2 op/s
---------------------------------------------------------------

The cluster stayed in this state. I have read the doc, which says that in this case "You may need to review settings in the Pool, PG and CRUSH Config Reference and make appropriate adjustments", but I don't see where my mistake is in my conf (I give my detailed conf below).

I don't know if it's important, but I upgraded my cluster from 0.80.8 to 0.80.9 today (before my attempt to remove osd.3) with, on each node of my cluster:

    apt-get update && apt-get upgrade
    restart ceph-mon-all
    restart ceph-osd-all
    restart ceph-mds-all

and then:

    ceph osd crush set-tunable straw_calc_version 1
    ceph osd crush reweight-all

I had no problem with this upgrade; the rebalancing was very fast (my cluster contains very little data).

So, my cluster was stuck in the state described above. I could come back to HEALTH_OK with just:

    ceph osd in 3

But I really would like to remove this OSD. Every time I try "ceph osd out 3", it reproduces the issue above (see the end of this mail for the full removal sequence I intend to follow). I really think my conf is OK, so I have no idea how to solve my problem.

Thanks in advance for your help.

Regards,
François Lafont

PS: here is my conf. I have 3 nodes running Ubuntu 14.04, kernel 3.16.0-31-generic and Ceph 0.80.9:

- node1 -> just a monitor
- node2 -> a monitor, some OSD daemons and an MDS
- node3 -> a monitor, some OSD daemons and an MDS

Each pool has "size == 2" and "min_size == 1":

---------------------------------------------------------------
~# ceph osd dump | grep -oE '^pool.*size [0-9]+' | column -t
pool  0   'data'                replicated  size  2  min_size  1
pool  1   'metadata'            replicated  size  2  min_size  1
pool  2   'rbd'                 replicated  size  2  min_size  1
pool  3   'volumes'             replicated  size  2  min_size  1
pool  4   'images'              replicated  size  2  min_size  1
pool  5   '.rgw.root'           replicated  size  2  min_size  1
pool  6   '.rgw.control'        replicated  size  2  min_size  1
pool  7   '.rgw'                replicated  size  2  min_size  1
pool  8   '.rgw.gc'             replicated  size  2  min_size  1
pool  9   '.users.uid'          replicated  size  2  min_size  1
pool  10  '.users.email'        replicated  size  2  min_size  1
pool  11  '.users'              replicated  size  2  min_size  1
pool  12  '.rgw.buckets.index'  replicated  size  2  min_size  1
pool  13  '.rgw.buckets'        replicated  size  2  min_size  1
---------------------------------------------------------------
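Since each pool has size == 2 and my CRUSH rule does "step chooseleaf firstn 0 type host" (see the crush map below), I expect every PG to place one replica on silo-1 and one on silo-2. If it helps, I can check that for individual stuck PGs and post the output; I would simply run something like this (3.7f is just an arbitrary PG id for the example):

    # list the PGs reported as stuck unclean
    ceph pg dump_stuck unclean | head
    # show the up/acting OSD sets for one of them
    ceph pg map 3.7f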
Here is my ceph.conf:

---------------------------------------------------------------
~# cat /etc/ceph/ceph.conf
### This file is managed by Puppet, don't edit it.
###

[global]
  auth client required = cephx
  auth cluster required = cephx
  auth service required = cephx
  cluster network = 192.168.22.0/24
  filestore xattr use omap = true
  fsid = xxxxxxxxxxxxxxxxxxxxxxxxxxxx
  mds cache size = 1000000
  osd crush chooseleaf type = 1
  osd journal size = 2048
  osd max backfills = 2
  osd pool default min size = 1
  osd pool default pg num = 512
  osd pool default pgp num = 512
  osd pool default size = 2
  osd recovery max active = 2
  public network = 10.0.2.0/24

[mon.0]
  host = monitor-a
  mon addr = 10.0.2.150

[mon.1]
  host = silo-1
  mon addr = 10.0.2.151

[mon.2]
  host = silo-2
  mon addr = 10.0.2.152

[client.radosgw.gw1]
  host = ostore-1
  rgw dns name = ostore
  rgw socket path = /var/run/ceph/ceph.radosgw.gw1.fastcgi.sock
  keyring = /etc/ceph/ceph.client.radosgw.gw1.keyring
  log file = /var/log/radosgw/client.radosgw.gw1.log

[client.radosgw.gw2]
  host = ostore-2
  rgw dns name = ostore
  rgw socket path = /var/run/ceph/ceph.radosgw.gw2.fastcgi.sock
  keyring = /etc/ceph/ceph.client.radosgw.gw2.keyring
  log file = /var/log/radosgw/client.radosgw.gw2.log
---------------------------------------------------------------

Here is my crush map:

---------------------------------------------------------------
~# cat /tmp/crushmap.txt
# begin crush map
tunable choose_local_tries 0
tunable choose_local_fallback_tries 0
tunable choose_total_tries 50
tunable chooseleaf_descend_once 1
tunable straw_calc_version 1

# devices
device 0 osd.0
device 1 osd.1
device 2 osd.2
device 3 osd.3
device 4 osd.4
device 5 osd.5
device 6 osd.6
device 7 osd.7
device 8 osd.8
device 9 osd.9
device 10 osd.10
device 11 osd.11
device 12 osd.12
device 13 osd.13
device 14 osd.14
device 15 osd.15
device 16 osd.16
device 17 osd.17
device 18 osd.18
device 19 osd.19
device 20 osd.20

# types
type 0 osd
type 1 host
type 2 chassis
type 3 rack
type 4 row
type 5 pdu
type 6 pod
type 7 room
type 8 datacenter
type 9 region
type 10 root

# buckets
host silo-2 {
        id -2           # do not change unnecessarily
        # weight 8.800
        alg straw
        hash 0          # rjenkins1
        item osd.0 weight 0.400
        item osd.2 weight 0.400
        item osd.4 weight 1.000
        item osd.6 weight 1.000
        item osd.8 weight 1.000
        item osd.10 weight 1.000
        item osd.12 weight 1.000
        item osd.14 weight 1.000
        item osd.16 weight 1.000
        item osd.18 weight 1.000
}
host silo-1 {
        id -3           # do not change unnecessarily
        # weight 9.800
        alg straw
        hash 0          # rjenkins1
        item osd.1 weight 0.400
        item osd.3 weight 0.400
        item osd.5 weight 1.000
        item osd.7 weight 1.000
        item osd.9 weight 1.000
        item osd.11 weight 1.000
        item osd.13 weight 1.000
        item osd.15 weight 1.000
        item osd.17 weight 1.000
        item osd.19 weight 1.000
        item osd.20 weight 1.000
}
root default {
        id -1           # do not change unnecessarily
        # weight 18.600
        alg straw
        hash 0          # rjenkins1
        item silo-2 weight 8.800
        item silo-1 weight 9.800
}

# rules
rule replicated_ruleset {
        ruleset 0
        type replicated
        min_size 1
        max_size 10
        step take default
        step chooseleaf firstn 0 type host
        step emit
}

# end crush map
---------------------------------------------------------------
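In case it is useful, my idea for checking that this rule can still find two hosts for each PG was to test the CRUSH map offline with crushtool. This is only a sketch of what I had in mind (the file names are just examples), I have not drawn any conclusion from it yet:

    # grab the current compiled crush map
    ceph osd getcrushmap -o /tmp/crushmap.bin
    # decompile it to the text form shown above
    crushtool -d /tmp/crushmap.bin -o /tmp/crushmap.txt
    # simulate placement of 2 replicas with rule 0; any input for which
    # CRUSH cannot find 2 OSDs should be reported
    crushtool -i /tmp/crushmap.bin --test --rule 0 --num-rep 2 --show-bad-mappings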
I have no problem of disk space:

---------------------------------------------------------------
~# ceph df
GLOBAL:
    SIZE      AVAIL     RAW USED     %RAW USED
    5172G     5009G     163G         3.16
POOLS:
    NAME                   ID     USED       %USED     MAX AVAIL     OBJECTS
    data                   0      512M       0         2468G         128
    metadata               1      122M       0         2468G         51
    rbd                    2      8          0         2468G         1
    volumes                3      51457M     0.97      2468G         13237
    images                 4      9345M      0.18      2468G         1186
    .rgw.root              5      840        0         2468G         3
    .rgw.control           6      0          0         2468G         8
    .rgw                   7      2352       0         2468G         13
    .rgw.gc                8      0          0         2468G         32
    .users.uid             9      706        0         2468G         4
    .users.email           10     18         0         2468G         2
    .users                 11     18         0         2468G         2
    .rgw.buckets.index     12     0          0         2468G         9
    .rgw.buckets           13     802k       0         2468G         308
---------------------------------------------------------------
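PS2: as mentioned above, here is the complete removal sequence I intend to run for osd.3 once the cluster is back to HEALTH_OK. It is just the standard sequence from the documentation; the "stop ceph-osd" form assumes the Upstart scripts used on my Ubuntu 14.04 nodes, and osd.3 is hosted on silo-1:

    ceph osd out 3
    # wait until all PGs are active+clean again, then, on silo-1:
    stop ceph-osd id=3
    # finally, remove the OSD from the crush map, from auth and from the osdmap
    ceph osd crush remove osd.3
    ceph auth del osd.3
    ceph osd rm 3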