Re: pg stuck in remapped+peering for a long time

Peter Theobald <pete@xxxxxxxxxxxxxxx> · Sun, 15 Nov 2015 01:26:27 +0000

Hi Gregory,
This is the output of ceph -s:
    cluster 5400bbc9-378d-4c69-afc4-da71393f7baf
     health HEALTH_WARN
            82 pgs peering
            82 pgs stuck inactive
            82 pgs stuck unclean
            1 requests are blocked > 32 sec
            pool images pg_num 256 > pgp_num 128
     monmap e2: 2 mons at {0=192.168.2.1:6789/0,1=192.168.2.3:6789/0}
            election epoch 16, quorum 0,1 0,1
     osdmap e168004: 9 osds: 9 up, 9 in; 4 remapped pgs
      pgmap v1317963: 256 pgs, 1 pools, 4377 GB data, 1105 kobjects
            8792 GB used, 15369 GB / 24162 GB avail
                 174 active+clean
                  78 peering
                   4 remapped+peering

Total capacity is about 24 TB. Used space is about 8.8 TB at a replication level of 2.
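
As a quick sanity check on those figures, here is a minimal Python sketch. It only uses numbers copied from the ceph -s output above; the variable names are mine and purely illustrative. It checks that the raw usage matches the data size times the replication level, and that the pg state counts add up.

```python
# Figures copied from the `ceph -s` output above (all sizes in GB).
data_gb = 4377      # logical data stored in the pool
used_gb = 8792      # raw space used across all OSDs
total_gb = 24162    # raw capacity across all OSDs
replicas = 2        # replication level stated in the message

# pg state counts from the pgmap section.
pg_states = {"active+clean": 174, "peering": 78, "remapped+peering": 4}

# With 2 replicas, raw usage should be roughly twice the logical data.
expected_raw_gb = data_gb * replicas
overhead_gb = used_gb - expected_raw_gb

# The state counts should add up to the pool's pg count, and everything
# that is not active+clean should match the "stuck" numbers in the health section.
total_pgs = sum(pg_states.values())
not_active = total_pgs - pg_states["active+clean"]

percent_used = round(100 * used_gb / total_gb, 1)

print(f"expected raw usage: {expected_raw_gb} GB (reported {used_gb} GB, "
      f"difference {overhead_gb} GB)")
print(f"pgs: {total_pgs} total, {not_active} not active+clean")
print(f"raw capacity used: {percent_used}%")
```

So the usage is consistent with 2x replication, the cluster is only about a third full, and the 82 stuck pgs are exactly the 78 peering plus the 4 remapped+peering ones.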

Regards
Pete

On 14 November 2015 at 18:03, Gregory Farnum <gfarnum@xxxxxxxxxx> wrote:
What's the full output of "ceph -s"? Are your new CRUSH rules actually satisfiable? Is your cluster filling up?
-Greg

On Saturday, November 14, 2015, Peter Theobald <pete@xxxxxxxxxxxxxxx> wrote:
Hi list,

I have a 3-node Ceph cluster with a total of 9 OSDs (2, 3 and 4 per node, with different-size drives). I changed the layout (failure domain from per-OSD to per-host, and changed min_size) and I now have a few pgs that have been stuck in peering or remapped+peering for a couple of days.

The hosts are underpowered: 2x HP MicroServers and a single i5 desktop-grade machine, so not super powerful. The network is fast though (bonded gigabit Ethernet with a dedicated switch).

I'm concerned that the remapped+peering pgs are stuck. All the pgs in peering or remapped+peering are stuck inactive and unclean, so I'm concerned about data loss. Do I just need to wait for them to fix themselves? I cannot see any mention of unfound objects when I query the remapped pgs, so I think I'm OK and just need to be patient. I have 128 pgs across 9 OSDs so probably have a lot of objects per pg. Total data is about 4 TB.
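
To put a rough number on "a lot of objects per pg", here is a small Python sketch. The object count (1105 kobjects) and data size (4377 GB) come from the ceph -s output in this thread; the helper name `per_pg` is hypothetical, and an even spread across pgs is an assumption, so treat the results as ballpark figures only.

```python
# Figures from the `ceph -s` output in this thread.
objects = 1105 * 1000   # "1105 kobjects"
data_gb = 4377          # logical data in the pool


def per_pg(pg_count):
    """Return (objects per pg, GB per pg), assuming an even spread."""
    return round(objects / pg_count), round(data_gb / pg_count, 1)


# 128 is the pg count mentioned in this message (and the pool's pgp_num);
# 256 is the pg_num reported by `ceph -s`.
for pg_count in (128, 256):
    objs, gb = per_pg(pg_count)
    print(f"{pg_count} pgs: about {objs} objects and {gb} GB per pg")
```

Either way, each pg holds thousands of objects and tens of GB.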

Regards

Pete

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com