Hi,
We added a new host to the cluster and it was rebalancing.
One PG got stuck in "inactive, peering" for a very long time, which created a lot of slow requests and poor performance across the whole cluster.
When I queried that PG, it showed this:
"recovery_state": [
{
"name": "Started/Primary/Peering/GetMissing",
"enter_time": "2018-01-22 18:40:04.777654",
"peer_missing_requested": [
{
"osd": "77(7)",
So I assumed it was stuck getting information from osd.77, so I marked osd.77 down.
The status of the PG changed to "active+undersized+degraded" and the PG became active again.
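(For completeness, roughly what I ran to mark it down:)

    ceph osd down osd.77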
Does anyone know why this happened?
If I start osd.77, the PG again becomes inactive and goes into the peering state.
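(By "start" I mean starting the OSD daemon itself; on a systemd deployment that would be something like:)

    # assuming a systemd-based deployment with the ceph-osd@<id> unit
    systemctl start ceph-osd@77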
Is it because osd.77 is bad? Or will the same thing happen when the PG tries to peer with another disk?
Any help is really appreciated
Karun