Hi!
I had a very similar issue a few days ago.
For me it wasn't too much of a problem, since the cluster was new
and held no data, so I could force-recreate the PGs. I really hope that in
your case it won't be necessary to do the same thing.
As a first step, try reducing min_size from 2 to 1 on the
.rgw.buckets pool, as suggested in the health detail output, and see if
that brings your cluster back to health.
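For reference, that change is a single command (pool name taken from your health detail output below); you can revert it the same way once the down OSD is recovered or replaced:

```shell
# Allow PGs in .rgw.buckets to go active with only one surviving replica
ceph osd pool set .rgw.buckets min_size 1

# After osd.0 is back (or replaced) and recovery has finished, restore it:
# ceph osd pool set .rgw.buckets min_size 2
```

Note that running with min_size 1 means writes are acknowledged with a single copy, so it should only be a temporary measure while recovery completes.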
Regards,
George
On Mon, 01 Dec 2014 17:09:31 +0300, Butkeev Stas wrote:
Hi all,
I have a Ceph cluster plus RGW. I now have a problem with one of my OSDs:
it's down. I checked the ceph status and see the following:
[root@node-1 ceph-0]# ceph -s
cluster fc8c3ecc-ccb8-4065-876c-dc9fc992d62d
health HEALTH_WARN 4 pgs incomplete; 4 pgs stuck inactive; 4 pgs
stuck unclean
monmap e1: 3 mons at
{a=10.29.226.39:6789/0,b=10.29.226.29:6789/0,c=10.29.226.40:6789/0},
election epoch 294, quorum 0,1,2 b,a,c
osdmap e418: 6 osds: 5 up, 5 in
pgmap v23588: 312 pgs, 16 pools, 141 kB data, 594 objects
5241 MB used, 494 GB / 499 GB avail
308 active+clean
4 incomplete
Why do I have 4 incomplete PGs in the .rgw.buckets pool if it has
replicated size 2 and min_size 2?
My OSD tree:
[root@node-1 ceph-0]# ceph osd tree
# id weight type name up/down reweight
-1 4 root croc
-2 4 region ru
-4 3 datacenter vol-5
-5 1 host node-1
0 1 osd.0 down 0
-6 1 host node-2
1 1 osd.1 up 1
-7 1 host node-3
2 1 osd.2 up 1
-3 1 datacenter comp
-8 1 host node-4
3 1 osd.3 up 1
-9 1 host node-5
4 1 osd.4 up 1
-10 1 host node-6
5 1 osd.5 up 1
Additional information:
[root@node-1 ceph-0]# ceph health detail
HEALTH_WARN 4 pgs incomplete; 4 pgs stuck inactive; 4 pgs stuck
unclean
pg 13.6 is stuck inactive for 1547.665758, current state incomplete,
last acting [1,3]
pg 13.4 is stuck inactive for 1547.652111, current state incomplete,
last acting [1,2]
pg 13.5 is stuck inactive for 4502.009928, current state incomplete,
last acting [1,3]
pg 13.2 is stuck inactive for 4501.979770, current state incomplete,
last acting [1,3]
pg 13.6 is stuck unclean for 4501.969914, current state incomplete,
last acting [1,3]
pg 13.4 is stuck unclean for 4502.001114, current state incomplete,
last acting [1,2]
pg 13.5 is stuck unclean for 4502.009942, current state incomplete,
last acting [1,3]
pg 13.2 is stuck unclean for 4501.979784, current state incomplete,
last acting [1,3]
pg 13.2 is incomplete, acting [1,3] (reducing pool .rgw.buckets
min_size from 2 may help; search ceph.com/docs for 'incomplete')
pg 13.6 is incomplete, acting [1,3] (reducing pool .rgw.buckets
min_size from 2 may help; search ceph.com/docs for 'incomplete')
pg 13.4 is incomplete, acting [1,2] (reducing pool .rgw.buckets
min_size from 2 may help; search ceph.com/docs for 'incomplete')
pg 13.5 is incomplete, acting [1,3] (reducing pool .rgw.buckets
min_size from 2 may help; search ceph.com/docs for 'incomplete')
[root@node-1 ceph-0]# ceph osd dump | grep 'pool'
pool 0 'rbd' replicated size 3 min_size 2 crush_ruleset 0 object_hash
rjenkins pg_num 64 pgp_num 64 last_change 1 flags hashpspool
stripe_width 0
pool 1 '.rgw.root' replicated size 3 min_size 2 crush_ruleset 0
object_hash rjenkins pg_num 8 pgp_num 8 last_change 34 owner
18446744073709551615 flags hashpspool stripe_width 0
pool 2 '.rgw.control' replicated size 3 min_size 2 crush_ruleset 0
object_hash rjenkins pg_num 8 pgp_num 8 last_change 36 owner
18446744073709551615 flags hashpspool stripe_width 0
pool 3 '.rgw' replicated size 3 min_size 2 crush_ruleset 0
object_hash rjenkins pg_num 8 pgp_num 8 last_change 38 owner
18446744073709551615 flags hashpspool stripe_width 0
pool 4 '.rgw.gc' replicated size 3 min_size 2 crush_ruleset 0
object_hash rjenkins pg_num 8 pgp_num 8 last_change 39 flags
hashpspool stripe_width 0
pool 5 '.users.uid' replicated size 3 min_size 2 crush_ruleset 0
object_hash rjenkins pg_num 8 pgp_num 8 last_change 40 owner
18446744073709551615 flags hashpspool stripe_width 0
pool 6 '.log' replicated size 3 min_size 2 crush_ruleset 0
object_hash rjenkins pg_num 8 pgp_num 8 last_change 42 owner
18446744073709551615 flags hashpspool stripe_width 0
pool 7 '.users' replicated size 3 min_size 2 crush_ruleset 0
object_hash rjenkins pg_num 8 pgp_num 8 last_change 44 flags
hashpspool stripe_width 0
pool 8 '.users.swift' replicated size 3 min_size 2 crush_ruleset 0
object_hash rjenkins pg_num 8 pgp_num 8 last_change 46 flags
hashpspool stripe_width 0
pool 9 '.usage' replicated size 3 min_size 2 crush_ruleset 0
object_hash rjenkins pg_num 8 pgp_num 8 last_change 48 flags
hashpspool stripe_width 0
pool 10 'test' replicated size 2 min_size 2 crush_ruleset 0
object_hash rjenkins pg_num 136 pgp_num 136 last_change 68 flags
hashpspool stripe_width 0
pool 11 '.rgw.buckets.index' replicated size 3 min_size 2
crush_ruleset 0 object_hash rjenkins pg_num 8 pgp_num 8 last_change
70
owner 18446744073709551615 flags hashpspool stripe_width 0
pool 12 '.rgw.buckets.extra' replicated size 3 min_size 2
crush_ruleset 0 object_hash rjenkins pg_num 8 pgp_num 8 last_change
72
owner 18446744073709551615 flags hashpspool stripe_width 0
pool 13 '.rgw.buckets' replicated size 2 min_size 2 crush_ruleset 0
object_hash rjenkins pg_num 8 pgp_num 8 last_change 383 owner
18446744073709551615 flags hashpspool stripe_width 0
pool 14 '.intent-log' replicated size 3 min_size 2 crush_ruleset 0
object_hash rjenkins pg_num 8 pgp_num 8 last_change 213 flags
hashpspool stripe_width 0
pool 15 '' replicated size 3 min_size 2 crush_ruleset 0 object_hash
rjenkins pg_num 8 pgp_num 8 last_change 238 flags hashpspool
stripe_width 0
--
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com