I am currently trying to figure out how to debug PG issues myself, and
the debugging documentation I have found has not been very helpful. In
my case the underlying problem is probably ZFS which I am using for my
OSDs, but it would be nice to be able to recover what I can. My health
output is:
# ceph health
HEALTH_WARN 39 pgs backfill; 26 pgs backfilling; 297 pgs degraded; 88
pgs down; 89 pgs peering; 19 pgs recovering; 35 pgs recovery_wait; 66
pgs stale; 96 pgs stuck inactive; 66 pgs stuck stale; 690 pgs stuck
unclean; 3 requests are blocked > 32 sec; recovery 86428/515041 objects
degraded (16.781%); pool iscsi pg_num 250 > pgp_num 100; pool iscsi has
too few pgs
Also if I try to do a "rbd -p <pool> ls" on any of my pools, the
command hangs.
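For anyone wanting to dig into the stuck PGs from a summary like the one above, here is a minimal sketch (not a definitive procedure). It turns the "pgs stuck <state>" entries of a `ceph health` summary into the matching `ceph pg dump_stuck` commands. The function name `stuck_cmds` is made up for illustration, and it assumes a grep that supports `-o`:

```shell
#!/bin/sh
# Hypothetical helper: reads a `ceph health` summary on stdin and prints
# one `ceph pg dump_stuck <state>` command per stuck state it mentions.
# Lines are joined first because pasted output is often hard-wrapped.
stuck_cmds() {
    tr '\n' ' ' |
        grep -oE '[0-9]+ +pgs +stuck +(inactive|unclean|stale)' |
        awk '{ print "ceph pg dump_stuck " $4 }'
}
```

Usage would be something like `ceph health | stuck_cmds`. The commands are only printed, not run, so they can be reviewed first.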
If I figure out anything, I will let you know.
I have an issue with incomplete pgs. I’ve tried repairing them, but no
luck. Any ideas what to check?
Output from ‘ceph health detail’:
HEALTH_ERR 2 pgs inconsistent; 1 pgs recovering; 1 pgs stuck unclean;
recovery 15/863113 degraded (0.002%); 5/287707 unfound (0.002%); 4
scrub errors
pg 22.ee is stuck unclean for 131473.768406, current state
active+recovering+inconsistent, last acting [45,16,21]
pg 22.ee is active+recovering+inconsistent, acting [45,16,21], 5
unfound
pg 22.4a is active+clean+inconsistent, acting [2,25,34]
recovery 15/863113 degraded (0.002%); 5/287707 unfound (0.002%)
4 scrub errors
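As a rough sketch of where to look next (an illustration, not a confirmed fix), the PG IDs flagged inconsistent can be pulled out of the `ceph health detail` text and then queried one at a time. The function name `inconsistent_pgs` is made up for illustration, and it assumes each PG line has the form "pg <id> is <state>, ..." as in the output above:

```shell
#!/bin/sh
# Hypothetical helper: reads `ceph health detail` output on stdin and
# prints the unique IDs of PGs whose state includes "inconsistent".
# It matches lines of the form "pg <id> is <state>, acting [...]".
inconsistent_pgs() {
    awk '$1 == "pg" && $4 ~ /inconsistent/ { print $2 }' | sort -u
}
```

Each ID can then be inspected with `ceph pg <id> query`, for example `for pg in $(ceph health detail | inconsistent_pgs); do ceph pg "$pg" query; done`. For the unfound objects on 22.ee, `ceph pg 22.ee list_missing` should list them (the subcommand is named `list_unfound` in newer releases, as far as I know).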
Eric