pgs incomplete

How can I tell ceph to give up on "incomplete" PGs?

I have 12 PGs that are stuck "inactive, incomplete" and won't recover.
I think this is because in the past I carelessly pulled disks too
quickly, without letting the system recover first.  I suspect the disks
that held the data for these PGs are long gone.
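For reference, this is roughly how I'm listing them (a sketch; nothing
here is specific to my cluster):

    # list PGs stuck inactive, with their state and acting OSD sets
    ceph pg dump_stuck inactive
    # the health summary also names each incomplete PG
    ceph health detail | grep incomplete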

Whatever the reason, I want to fix it so I have a clean cluster, even
if that means losing data.

I went through the "troubleshooting pgs" guide[1], which is excellent,
but it didn't get me to a fix.

The output of `ceph pg 2.0 query` includes this:
    "recovery_state": [
        {
            "name": "Started/Primary/Peering/Incomplete",
            "enter_time": "2019-06-25 18:35:20.306634",
            "comment": "not enough complete instances of this PG"
        },

I've already restarted all OSDs in various orders, and I changed
min_size to 1 to see if that would let them recover, but no such luck.
These pools are not erasure coded, and I'm running the Luminous
release.
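Concretely, what I ran was along these lines (the pool name and OSD
ids are placeholders):

    # restart each OSD daemon in turn (systemd deployment)
    systemctl restart ceph-osd@<id>
    # drop the replicated pool's min_size so a single copy can serve I/O
    ceph osd pool set <pool> min_size 1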

How can I tell ceph to give up on these PGs?  Nothing is identified as
unfound, so mark_unfound_lost doesn't help.  I feel like `ceph osd
lost` might be it, but at this point the OSD numbers have been reused
for new disks, so I'd really like to limit the damage to just the 12
incomplete PGs if possible.
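For reference, these are the two commands I've been weighing (pg and
osd ids are placeholders):

    # only acts on objects flagged unfound -- none are, in my case
    ceph pg <pgid> mark_unfound_lost delete
    # declares the OSD permanently gone, but it applies to everything
    # that OSD ever held, and my OSD ids have been reused
    ceph osd lost <id> --yes-i-really-mean-it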

Thanks,
Adam

[1]
http://docs.ceph.com/docs/master/rados/troubleshooting/troubleshooting-pg/