Re: Placement Groups fail on fresh Ceph cluster installation with all OSDs up and in

B L <super.iterator@xxxxxxxxx> · Tue, 10 Feb 2015 13:24:50 +0200

I will try changing the replication size now, as you suggested, but how is that related to the unhealthy cluster?

On Feb 10, 2015, at 1:22 PM, B L <super.iterator@xxxxxxxxx> wrote:

Hi Vickie,

My OSD tree looks like this:

ceph@ceph-node3:/home/ubuntu$ ceph osd tree
# id	weight	type name	up/down	reweight
-1	0	root default
-2	0		host ceph-node1
0	0			osd.0	up	1
1	0			osd.1	up	1
-3	0		host ceph-node3
2	0			osd.2	up	1
3	0			osd.3	up	1
-4	0		host ceph-node2
4	0			osd.4	up	1
5	0			osd.5	up	1
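
In the tree above, the second column (weight) is 0 for every OSD. CRUSH generally does not map data to an OSD whose weight is 0, so this may be worth checking when PGs stay incomplete. A minimal sketch, assuming the column layout shown in this message (other Ceph releases may print different columns), of a filter that lists such OSDs. The function name zero_weight_osds is hypothetical.

```shell
# Hypothetical helper: read `ceph osd tree` text on stdin and print the
# name of every OSD whose weight column (field 2) is numerically zero.
# Assumes the layout shown above: id, weight, name, up/down, reweight.
zero_weight_osds() {
    awk '$3 ~ /^osd\.[0-9]+$/ && $2 + 0 == 0 { print $3 }'
}

# Usage on a live cluster:
#   ceph osd tree | zero_weight_osds
```

If OSDs are listed, "ceph osd crush reweight osd.0 <weight>" is the usual command for assigning a CRUSH weight. The appropriate value depends on the disk size and is not given in this thread.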

On Feb 10, 2015, at 1:18 PM, Vickie ch <mika.leaf666@xxxxxxxxx> wrote:

Hi Beanos:
By the way, if your cluster is just for testing, you may try reducing the replica size and min_size:
"ceph osd pool set rbd size 2;ceph osd pool set data size 2;ceph osd pool set metadata size 2"
"ceph osd pool set rbd min_size 1;ceph osd pool set data min_size 1;ceph osd pool set metadata min_size 1"
Then open another terminal and run "ceph -w" to watch the PG status.

Best wishes,
Vickie
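
The six commands suggested above can also be written as a loop. This is only a sketch: the helper name set_pool_replication and the DRY_RUN variable are hypothetical, and by default the function just prints the commands so they can be reviewed before anything is changed.

```shell
# Hypothetical helper: apply one replica size and min_size to several pools.
# With DRY_RUN=1 (the default here) it only prints the commands.
set_pool_replication() {
    size="$1"; min_size="$2"; shift 2
    for pool in "$@"; do
        for setting in "size $size" "min_size $min_size"; do
            if [ "${DRY_RUN:-1}" = "1" ]; then
                echo "ceph osd pool set $pool $setting"
            else
                # $setting is left unquoted on purpose so it splits into key and value
                ceph osd pool set "$pool" $setting
            fi
        done
    done
}

# Print the commands for the three pools named in this thread.
set_pool_replication 2 1 rbd data metadata
```

Setting DRY_RUN=0 before the call would run the commands against the cluster instead of printing them.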

2015-02-10 19:16 GMT+08:00 Vickie ch <mika.leaf666@xxxxxxxxx>:
Hi Beanos:
So you have 3 OSD servers, and each of them has 2 disks.
I have a question: what is the result of "ceph osd tree"? It looks like the OSD status is "down".

Best wishes,
Vickie

2015-02-10 19:00 GMT+08:00 B L <super.iterator@xxxxxxxxx>:
Here is the updated direct copy/paste dump

ceph@ceph-node1:~$ ceph osd dump
epoch 25
fsid 17bea68b-1634-4cd1-8b2a-00a60ef4761d
created 2015-02-08 16:59:07.050875
modified 2015-02-09 22:35:33.191218
flags
pool 0 'data' replicated size 3 min_size 2 crush_ruleset 0 object_hash rjenkins pg_num 128 pgp_num 64 last_change 24 flags hashpspool crash_replay_interval 45 stripe_width 0
pool 1 'metadata' replicated size 3 min_size 2 crush_ruleset 0 object_hash rjenkins pg_num 64 pgp_num 64 last_change 1 flags hashpspool stripe_width 0
pool 2 'rbd' replicated size 3 min_size 2 crush_ruleset 0 object_hash rjenkins pg_num 64 pgp_num 64 last_change 1 flags hashpspool stripe_width 0
max_osd 6
osd.0 up   in  weight 1 up_from 4 up_thru 17 down_at 0 last_clean_interval [0,0) 172.31.0.84:6800/11739 172.31.0.84:6801/11739 172.31.0.84:6802/11739 172.31.0.84:6803/11739 exists,up 765f5066-d13e-4a9e-a446-8630ee06e596
osd.1 up   in  weight 1 up_from 7 up_thru 0 down_at 0 last_clean_interval [0,0) 172.31.0.84:6805/12279 172.31.0.84:6806/12279 172.31.0.84:6807/12279 172.31.0.84:6808/12279 exists,up e1d073e5-9397-4b63-8b7c-a4064e430f7a
osd.2 up   in  weight 1 up_from 10 up_thru 0 down_at 0 last_clean_interval [0,0) 172.31.3.57:6800/5517 172.31.3.57:6801/5517 172.31.3.57:6802/5517 172.31.3.57:6803/5517 exists,up 5af5deed-7a6d-4251-aa3c-819393901d1f
osd.3 up   in  weight 1 up_from 13 up_thru 0 down_at 0 last_clean_interval [0,0) 172.31.3.57:6805/6043 172.31.3.57:6806/6043 172.31.3.57:6807/6043 172.31.3.57:6808/6043 exists,up 958f37ab-b434-40bd-87ab-3acbd3118f92
osd.4 up   in  weight 1 up_from 16 up_thru 0 down_at 0 last_clean_interval [0,0) 172.31.3.56:6800/5106 172.31.3.56:6801/5106 172.31.3.56:6802/5106 172.31.3.56:6803/5106 exists,up ce5c0b86-96be-408a-8022-6397c78032be
osd.5 up   in  weight 1 up_from 22 up_thru 0 down_at 0 last_clean_interval [0,0) 172.31.3.56:6805/7019 172.31.3.56:6806/7019 172.31.3.56:6807/7019 172.31.3.56:6808/7019 exists,up da67b604-b32a-44a0-9920-df0774ad2ef3
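
The dump above shows pool 'data' with pg_num 128 but pgp_num 64, which matches the "pg_num 128 > pgp_num 64" part of the health warning. A rough sketch, assuming the "pool ..." line format shown in this dump, of a filter that prints a suggested command for each pool where the two values differ. The function name suggest_pgp_num_fix is hypothetical, and it only prints commands rather than running them.

```shell
# Hypothetical helper: read `ceph osd dump` text on stdin and print a
# suggested command for each pool whose pgp_num differs from its pg_num.
suggest_pgp_num_fix() {
    awk '$1 == "pool" {
        name = $3
        gsub(/[^A-Za-z0-9_.-]/, "", name)   # strip the quotes around the pool name
        pg = ""; pgp = ""
        for (i = 4; i < NF; i++) {
            if ($i == "pg_num")  pg  = $(i + 1)
            if ($i == "pgp_num") pgp = $(i + 1)
        }
        if (pg != "" && pgp != "" && pg != pgp)
            printf "ceph osd pool set %s pgp_num %s\n", name, pg
    }'
}

# Usage on a live cluster:
#   ceph osd dump | suggest_pgp_num_fix
```

Aligning pgp_num with pg_num addresses only that one warning. It is not expected to resolve the incomplete PGs by itself.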

On Feb 10, 2015, at 12:55 PM, B L <super.iterator@xxxxxxxxx> wrote:

On Feb 10, 2015, at 12:37 PM, B L <super.iterator@xxxxxxxxx> wrote:

Hi Vickie,

Thanks for your reply!

You can find the dump in this link:

https://gist.github.com/anonymous/706d4a1ec81c93fd1eca

Thanks!
B.

On Feb 10, 2015, at 12:23 PM, Vickie ch <mika.leaf666@xxxxxxxxx> wrote:

Hi Beanos:
   Would you post the result of "ceph osd dump"?

Best wishes,
Vickie

2015-02-10 16:36 GMT+08:00 B L <super.iterator@xxxxxxxxx>:
I'm having a problem with my fresh cluster, which is not healthy. My cluster status summary shows this:
ceph@ceph-node1:~$ ceph -s

    cluster 17bea68b-1634-4cd1-8b2a-00a60ef4761d
     health HEALTH_WARN 256 pgs incomplete; 256 pgs stuck inactive; 256 pgs stuck unclean; pool data pg_num 128 > pgp_num 64
     monmap e1: 1 mons at {ceph-node1=172.31.0.84:6789/0}, election epoch 2, quorum 0 ceph-node1
     osdmap e25: 6 osds: 6 up, 6 in
      pgmap v82: 256 pgs, 3 pools, 0 bytes data, 0 objects
            198 MB used, 18167 MB / 18365 MB avail
                 192 incomplete
                  64 creating+incomplete

Where shall I start troubleshooting this?

P.S. I’m new to Ceph.

Thanks!
Beanos

_______________________________________________

ceph-users mailing list

ceph-users@xxxxxxxxxxxxxx

http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
