On Mon, Dec 01, 2014 at 05:09:31PM +0300, Butkeev Stas wrote:
> Hi all,
> I have a Ceph cluster + RGW. I now have a problem with one of the OSDs: it is down. When I check the cluster status I see the following:
>
> [root@node-1 ceph-0]# ceph -s
>     cluster fc8c3ecc-ccb8-4065-876c-dc9fc992d62d
>      health HEALTH_WARN 4 pgs incomplete; 4 pgs stuck inactive; 4 pgs stuck unclean
>      monmap e1: 3 mons at {a=10.29.226.39:6789/0,b=10.29.226.29:6789/0,c=10.29.226.40:6789/0}, election epoch 294, quorum 0,1,2 b,a,c
>      osdmap e418: 6 osds: 5 up, 5 in
>       pgmap v23588: 312 pgs, 16 pools, 141 kB data, 594 objects
>             5241 MB used, 494 GB / 499 GB avail
>                  308 active+clean
>                    4 incomplete
>
> Why do I have 4 incomplete pgs in pool .rgw.buckets if it has replicated size 2 and min_size 2?
>
> My osd tree:
>
> [root@node-1 ceph-0]# ceph osd tree
> # id    weight  type name                      up/down reweight
> -1      4       root croc
> -2      4         region ru
> -4      3           datacenter vol-5
> -5      1             host node-1
> 0       1               osd.0                  down    0
> -6      1             host node-2
> 1       1               osd.1                  up      1
> -7      1             host node-3
> 2       1               osd.2                  up      1
> -3      1           datacenter comp
> -8      1             host node-4
> 3       1               osd.3                  up      1
> -9      1             host node-5
> 4       1               osd.4                  up      1
> -10     1             host node-6
> 5       1               osd.5                  up      1
>
> Additional information:
>
> [root@node-1 ceph-0]# ceph health detail
> HEALTH_WARN 4 pgs incomplete; 4 pgs stuck inactive; 4 pgs stuck unclean
> pg 13.6 is stuck inactive for 1547.665758, current state incomplete, last acting [1,3]
> pg 13.4 is stuck inactive for 1547.652111, current state incomplete, last acting [1,2]
> pg 13.5 is stuck inactive for 4502.009928, current state incomplete, last acting [1,3]
> pg 13.2 is stuck inactive for 4501.979770, current state incomplete, last acting [1,3]
> pg 13.6 is stuck unclean for 4501.969914, current state incomplete, last acting [1,3]
> pg 13.4 is stuck unclean for 4502.001114, current state incomplete, last acting [1,2]
> pg 13.5 is stuck unclean for 4502.009942, current state incomplete, last acting [1,3]
> pg 13.2 is stuck unclean for 4501.979784, current state incomplete, last acting [1,3]
> pg 13.2 is incomplete, acting [1,3] (reducing pool .rgw.buckets min_size from 2 may help; search ceph.com/docs for 'incomplete')
> pg 13.6 is incomplete, acting [1,3] (reducing pool .rgw.buckets min_size from 2 may help; search ceph.com/docs for 'incomplete')
> pg 13.4 is incomplete, acting [1,2] (reducing pool .rgw.buckets min_size from 2 may help; search ceph.com/docs for 'incomplete')
> pg 13.5 is incomplete, acting [1,3] (reducing pool .rgw.buckets min_size from 2 may help; search ceph.com/docs for 'incomplete')
>
> [root@node-1 ceph-0]# ceph osd dump | grep 'pool'
> pool 0 'rbd' replicated size 3 min_size 2 crush_ruleset 0 object_hash rjenkins pg_num 64 pgp_num 64 last_change 1 flags hashpspool stripe_width 0
> pool 1 '.rgw.root' replicated size 3 min_size 2 crush_ruleset 0 object_hash rjenkins pg_num 8 pgp_num 8 last_change 34 owner 18446744073709551615 flags hashpspool stripe_width 0
> pool 2 '.rgw.control' replicated size 3 min_size 2 crush_ruleset 0 object_hash rjenkins pg_num 8 pgp_num 8 last_change 36 owner 18446744073709551615 flags hashpspool stripe_width 0
> pool 3 '.rgw' replicated size 3 min_size 2 crush_ruleset 0 object_hash rjenkins pg_num 8 pgp_num 8 last_change 38 owner 18446744073709551615 flags hashpspool stripe_width 0
> pool 4 '.rgw.gc' replicated size 3 min_size 2 crush_ruleset 0 object_hash rjenkins pg_num 8 pgp_num 8 last_change 39 flags hashpspool stripe_width 0
> pool 5 '.users.uid' replicated size 3 min_size 2 crush_ruleset 0 object_hash rjenkins pg_num 8 pgp_num 8 last_change 40 owner 18446744073709551615 flags hashpspool stripe_width 0
> pool 6 '.log' replicated size 3 min_size 2 crush_ruleset 0 object_hash rjenkins pg_num 8 pgp_num 8 last_change 42 owner 18446744073709551615 flags hashpspool stripe_width 0
> pool 7 '.users' replicated size 3 min_size 2 crush_ruleset 0 object_hash rjenkins pg_num 8 pgp_num 8 last_change 44 flags hashpspool stripe_width 0
> pool 8 '.users.swift' replicated size 3 min_size 2 crush_ruleset 0 object_hash rjenkins pg_num 8 pgp_num 8 last_change 46 flags hashpspool stripe_width 0
> pool 9 '.usage' replicated size 3 min_size 2 crush_ruleset 0 object_hash rjenkins pg_num 8 pgp_num 8 last_change 48 flags hashpspool stripe_width 0
> pool 10 'test' replicated size 2 min_size 2 crush_ruleset 0 object_hash rjenkins pg_num 136 pgp_num 136 last_change 68 flags hashpspool stripe_width 0
> pool 11 '.rgw.buckets.index' replicated size 3 min_size 2 crush_ruleset 0 object_hash rjenkins pg_num 8 pgp_num 8 last_change 70 owner 18446744073709551615 flags hashpspool stripe_width 0
> pool 12 '.rgw.buckets.extra' replicated size 3 min_size 2 crush_ruleset 0 object_hash rjenkins pg_num 8 pgp_num 8 last_change 72 owner 18446744073709551615 flags hashpspool stripe_width 0
> pool 13 '.rgw.buckets' replicated size 2 min_size 2 crush_ruleset 0 object_hash rjenkins pg_num 8 pgp_num 8 last_change 383 owner 18446744073709551615 flags hashpspool stripe_width 0
> pool 14 '.intent-log' replicated size 3 min_size 2 crush_ruleset 0 object_hash rjenkins pg_num 8 pgp_num 8 last_change 213 flags hashpspool stripe_width 0
> pool 15 '' replicated size 3 min_size 2 crush_ruleset 0 object_hash rjenkins pg_num 8 pgp_num 8 last_change 238 flags hashpspool stripe_width 0

Some of your pools have size = 3.

> --
> With regards,
> Stanislav Butkeev

--
Tomasz Kuzemko
tomasz.kuzemko@xxxxxxx
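
Before changing anything, the stuck state can be narrowed down with the stock ceph CLI; a minimal sketch, reusing the PG and pool names from the output above (adjust to your own IDs, and note the exact query output varies by release):

    # show peering history / recovery_state for one of the stuck PGs
    ceph pg 13.2 query

    # confirm the replication settings of the pool the stuck PGs belong to
    ceph osd pool get .rgw.buckets size
    ceph osd pool get .rgw.buckets min_size

    # list every pool that is still configured with three replicas
    ceph osd dump | grep "replicated size 3"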
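
If the immediate goal is only to let the four PGs peer and serve I/O again while osd.0 is down, the hint printed by ceph health detail can be applied per pool; this is a sketch and assumes the PGs are blocked on min_size rather than on missing PG history (a truly incomplete PG may still need the down OSD, or its data, back):

    # temporarily allow the PGs to go active with a single complete replica
    ceph osd pool set .rgw.buckets min_size 1

    # once osd.0 is back up/in and the PGs are active+clean, restore the old value
    ceph osd pool set .rgw.buckets min_size 2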