I am seeing the same error message from the ceph health command. I am using Ubuntu 14.04 with ceph 0.79, the ceph distribution that comes with the Ubuntu release.

My configuration is:
1 x mon
1 x OSD
Both the OSD and mon are on the same host.

rsudarsa@rsudarsa-ce1:~/mycluster$ ceph -s
    cluster 5330b56b-bfbb-4ff8-aeb8-138233c2bd9a
     health HEALTH_WARN 192 pgs incomplete; 192 pgs stuck inactive; 192 pgs stuck unclean
     monmap e1: 1 mons at {rsudarsa-ce2=192.168.252.196:6789/0}, election epoch 2, quorum 0 rsudarsa-ce2
     osdmap e4: 1 osds: 1 up, 1 in
      pgmap v12: 192 pgs, 3 pools, 0 bytes data, 0 objects
            6603 MB used, 856 GB / 908 GB avail
                 192 incomplete

rsudarsa@rsudarsa-ce1:~/mycluster$ ceph osd tree
# id    weight  type name               up/down reweight
-1      0.89    root default
-2      0.89            host rsudarsa-ce2
0       0.89                    osd.0   up      1

rsudarsa@rsudarsa-ce1:~/mycluster$ ceph osd dump
epoch 4
fsid 5330b56b-bfbb-4ff8-aeb8-138233c2bd9a
created 2014-05-27 10:11:33.995272
modified 2014-05-27 10:13:34.157068
flags
pool 0 'data' replicated size 3 min_size 2 crush_ruleset 0 object_hash rjenkins pg_num 64 pgp_num 64 last_change 1 owner 0 flags hashpspool crash_replay_interval 45 stripe_width 0
pool 1 'metadata' replicated size 3 min_size 2 crush_ruleset 0 object_hash rjenkins pg_num 64 pgp_num 64 last_change 1 owner 0 flags hashpspool stripe_width 0
pool 2 'rbd' replicated size 3 min_size 2 crush_ruleset 0 object_hash rjenkins pg_num 64 pgp_num 64 last_change 1 owner 0 flags hashpspool stripe_width 0
max_osd 1
osd.0 up in weight 1 up_from 4 up_thru 0 down_at 0 last_clean_interval [0,0) 192.168.252.196:6800/7071 192.168.252.196:6801/7071 192.168.252.196:6802/7071 192.168.252.196:6803/7071 exists,up 8b1c2bbb-b2f0-4974-b0f5-266c558cc732

From: ceph-users [mailto:ceph-users-bounces@xxxxxxxxxxxxxx] On Behalf Of jan.zeller at id.unibe.ch
Sent: Friday, May 23, 2014 6:31 AM
To: michael at onlinefusion.co.uk; ceph-users at lists.ceph.com
Subject: Re: pgs incomplete; pgs stuck inactive; pgs stuck unclean

Thanks for your tips & tricks.
This setup is now based on Ubuntu 12.04, ceph version 0.80.1.

Still using:
1 x mon
3 x osds

root@ceph-node2:~# ceph osd tree
# id    weight  type name               up/down reweight
-1      0       root default
-2      0               host ceph-node2
0       0                       osd.0   up      1
-3      0               host ceph-node3
1       0                       osd.1   up      1
-4      0               host ceph-node1
2       0                       osd.2   up      1

root@ceph-node2:~# ceph -s
    cluster c30e1410-fe1a-4924-9112-c7a5d789d273
     health HEALTH_WARN 192 pgs incomplete; 192 pgs stuck inactive; 192 pgs stuck unclean
     monmap e1: 1 mons at {ceph-node1=192.168.123.48:6789/0}, election epoch 2, quorum 0 ceph-node1
     osdmap e11: 3 osds: 3 up, 3 in
      pgmap v18: 192 pgs, 3 pools, 0 bytes data, 0 objects
            102 MB used, 15224 MB / 15326 MB avail
                 192 incomplete

root@ceph-node2:~# cat mycrushmap.txt
# begin crush map
tunable choose_local_tries 0
tunable choose_local_fallback_tries 0
tunable choose_total_tries 50
tunable chooseleaf_descend_once 1

# devices
device 0 osd.0
device 1 osd.1
device 2 osd.2

# types
type 0 osd
type 1 host
type 2 chassis
type 3 rack
type 4 row
type 5 pdu
type 6 pod
type 7 room
type 8 datacenter
type 9 region
type 10 root

# buckets
host ceph-node2 {
        id -2           # do not change unnecessarily
        # weight 0.000
        alg straw
        hash 0  # rjenkins1
        item osd.0 weight 0.000
}
host ceph-node3 {
        id -3           # do not change unnecessarily
        # weight 0.000
        alg straw
        hash 0  # rjenkins1
        item osd.1 weight 0.000
}
host ceph-node1 {
        id -4           # do not change unnecessarily
        # weight 0.000
        alg straw
        hash 0  # rjenkins1
        item osd.2 weight 0.000
}
root default {
        id -1           # do not change unnecessarily
        # weight 0.000
        alg straw
        hash 0  # rjenkins1
        item ceph-node2 weight 0.000
        item ceph-node3 weight 0.000
        item ceph-node1 weight 0.000
}

# rules
rule replicated_ruleset {
        ruleset 0
        type replicated
        min_size 1
        max_size 10
        step take default
        step chooseleaf firstn 0 type host
        step emit
}

# end crush map

Is there anything wrong with it?

root@ceph-node2:~# ceph osd dump
epoch 11
fsid c30e1410-fe1a-4924-9112-c7a5d789d273
created 2014-05-23 15:16:57.772981
modified 2014-05-23 15:18:17.022152
flags
pool 0 'data' replicated size 3 min_size 2 crush_ruleset 0 object_hash rjenkins pg_num 64 pgp_num 64 last_change 1 owner 0 flags hashpspool crash_replay_interval 45 stripe_width 0
pool 1 'metadata' replicated size 3 min_size 2 crush_ruleset 0 object_hash rjenkins pg_num 64 pgp_num 64 last_change 1 owner 0 flags hashpspool stripe_width 0
pool 2 'rbd' replicated size 3 min_size 2 crush_ruleset 0 object_hash rjenkins pg_num 64 pgp_num 64 last_change 1 owner 0 flags hashpspool stripe_width 0
max_osd 3
osd.0 up in weight 1 up_from 4 up_thru 5 down_at 0 last_clean_interval [0,0) 192.168.123.49:6800/4714 192.168.123.49:6801/4714 192.168.123.49:6802/4714 192.168.123.49:6803/4714 exists,up bc991a4b-9e60-4759-b35a-7f58852aa804
osd.1 up in weight 1 up_from 8 up_thru 0 down_at 0 last_clean_interval [0,0) 192.168.123.50:6800/4685 192.168.123.50:6801/4685 192.168.123.50:6802/4685 192.168.123.50:6803/4685 exists,up bd099d83-2483-42b9-9dbc-7f4e4043ca60
osd.2 up in weight 1 up_from 11 up_thru 0 down_at 0 last_clean_interval [0,0) 192.168.123.53:6800/16807 192.168.123.53:6801/16807 192.168.123.53:6802/16807 192.168.123.53:6803/16807 exists,up 80a302d0-3493-4c39-b34b-5af233b32ba1

Thanks

From: ceph-users [mailto:ceph-users-bounces at lists.ceph.com] On Behalf Of Michael
Sent: Friday, 23 May 2014 12:36
To: ceph-users at lists.ceph.com
Subject: Re: pgs incomplete; pgs stuck inactive; pgs stuck unclean

64 PGs per pool shouldn't cause any issues while there are only 3 OSDs.
It'll be something to pay attention to if a lot more get added, though.

Your replication setup is probably using something other than host. You'll want to extract your CRUSH map and decompile it, then see whether your "step chooseleaf" line is set to osd or rack. If it's not host, change it to host and pull the map back in again (see the command sketch at the end of this message). Check the docs on CRUSH maps, http://ceph.com/docs/master/rados/operations/crush-map/, for more info.

-Michael

On 23/05/2014 10:53, Karan Singh wrote:

Try increasing the placement groups for the pools:

ceph osd pool set data pg_num 128
ceph osd pool set data pgp_num 128

and similarly for the other 2 pools as well.

- Karan -
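
For reference, a minimal sketch of the extract / decompile / edit / recompile / inject cycle Michael describes above. The file names here are arbitrary placeholders, and the edit step only applies if the rule does not already say "type host":

    # dump the cluster's current CRUSH map and decompile it to text
    ceph osd getcrushmap -o crushmap.bin
    crushtool -d crushmap.bin -o crushmap.txt

    # edit crushmap.txt so the replicated rule reads:
    #   step chooseleaf firstn 0 type host

    # recompile the edited map and inject it back into the cluster
    crushtool -c crushmap.txt -o crushmap-new.bin
    ceph osd setcrushmap -i crushmap-new.bin

Re-running ceph -s afterwards should show whether the stuck PGs begin peering once the map change takes effect.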