Re: How to fix a Ceph PG in unknown state with no OSDs?


 



Is this a new cluster? Or did the CRUSH map change somehow recently? One way this might happen is if CRUSH simply failed to map the PG at all, although I think that if the PG exists anywhere it should still be getting reported as inactive.
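One way to check that (a rough sketch; "1.721" is the PG id from the report below, and the pool/rule names are whatever your cluster actually uses) is to grab the current osdmap and test the CRUSH mapping offline:

     ceph osd getmap -o /tmp/osdmap
     osdmaptool /tmp/osdmap --test-map-pg 1.721

If that also comes back with an empty up/acting set, it suggests the CRUSH rule for that pool can't select any OSDs with the current map; comparing "ceph osd pool get <pool> crush_rule" against "ceph osd crush rule dump" would be a reasonable next step.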
On Thu, Jun 14, 2018 at 8:40 AM Oliver Schulz <oliver.schulz@xxxxxxxxxxxxxx> wrote:
Dear all,

I have a serious problem with our Ceph cluster: One of our PGs somehow
ended up in this state (reported by "ceph health detail"):

     pg 1.XXX is stuck inactive for ..., current state unknown, last acting []

Also, "ceph pg map 1.xxx" reports:

     osdmap e525812 pg 1.721 (1.721) -> up [] acting []

I can't use "ceph pg 1.XXX query"; it just hangs with no output.

All OSDs are up and in, I have MON quorum, all other PGs seem to be fine.

How can I diagnose/fix this? Unfortunately, the PG in question is part
of the CephFS metadata pool ...

Any help would be very, very much appreciated!


Cheers,

Oliver
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
