Is the Ceph cluster stuck in a recovery state?
Did you try the commands "ceph pg repair <pg-id>" or "ceph pg <pg-id> query" to trace its state?
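For example, something along these lines (3.d is the stuck pg from your output; the exact fields in the query output vary a bit by release):

$ ceph pg 3.d query           # dumps peering info, up/acting sets and recovery state for the pg
$ ceph pg repair 3.d          # asks the primary OSD to scrub the pg and repair inconsistencies
$ ceph pg dump_stuck unclean  # lists every pg currently stuck unclean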
2016-03-24 22:36 GMT+08:00 yang sheng <forsaks.30@xxxxxxxxx>:
Hi all,

I am testing Ceph right now using 4 servers with 8 OSDs (all OSDs are up and in). I have 3 pools in my cluster (an image pool, a volume pool and the default rbd pool); both the image pool and the volume pool have replication size = 3. Based on the pg equation, there are 448 pgs in my cluster.

$ ceph osd tree
ID WEIGHT   TYPE NAME                          UP/DOWN REWEIGHT PRIMARY-AFFINITY
-1 16.07797 root default
-5 14.38599     rack rack1
-2  7.17599         host psusnjhhdlc7iosstb001
 0  3.53899             osd.0                       up  1.00000          1.00000
 1  3.63699             osd.1                       up  1.00000          1.00000
-3  7.20999         host psusnjhhdlc7iosstb002
 2  3.63699             osd.2                       up  1.00000          1.00000
 3  3.57300             osd.3                       up  1.00000          1.00000
-6  1.69199     rack rack2
-4  0.83600         host psusnjhhdlc7iosstb003
 5  0.43500             osd.5                       up  1.00000          1.00000
 4  0.40099             osd.4                       up  1.00000          1.00000
-7  0.85599         host psusnjhhdlc7iosstb004
 6  0.40099             osd.6                       up  1.00000                0
 7  0.45499             osd.7                       up  1.00000                0

$ ceph osd dump
pool 0 'rbd' replicated size 2 min_size 2 crush_ruleset 0 object_hash rjenkins pg_num 64 pgp_num 64 last_change 745 flags hashpspool stripe_width 0
pool 3 'imagesliberty' replicated size 3 min_size 2 crush_ruleset 0 object_hash rjenkins pg_num 128 pgp_num 128 last_change 777 flags hashpspool stripe_width 0 removed_snaps [1~1,8~c]
pool 4 'volumesliberty' replicated size 3 min_size 2 crush_ruleset 0 object_hash rjenkins pg_num 256 pgp_num 256 last_change 776 flags hashpspool stripe_width 0 removed_snaps [1~1,15~14,2a~1,2c~1,2e~24,57~2,5a~18,74~2,78~1,94~5,b7~2]

Right now the ceph health is HEALTH_WARN. I used "ceph health detail" to dump the information, and there is a stuck pg.

$ ceph -s
    cluster 2e906379-f211-4329-8faf-a8e7600b8418
     health HEALTH_WARN
            1 pgs degraded
            1 pgs stuck degraded
            1 pgs stuck inactive
            1 pgs stuck unclean
            1 pgs stuck undersized
            1 pgs undersized
            recovery 23/55329 objects degraded (0.042%)
     monmap e14: 2 mons at {psusnjhhdlc7ioscom002=192.168.2.62:6789/0,psusnjhhdlc7ioscon002=192.168.2.12:6789/0}
            election epoch 106, quorum 0,1 psusnjhhdlc7ioscon002,psusnjhhdlc7ioscom002
     osdmap e776: 8 osds: 8 up, 8 in
            flags sortbitwise
      pgmap v519644: 448 pgs, 3 pools, 51541 MB data, 18443 objects
            170 GB used, 16294 GB / 16464 GB avail
            23/55329 objects degraded (0.042%)
                 447 active+clean
                   1 undersized+degraded+peered

$ ceph health detail
HEALTH_WARN 1 pgs degraded; 1 pgs stuck unclean; 1 pgs undersized; recovery 23/55329 objects degraded (0.042%)
pg 3.d is stuck unclean for 58161.177025, current state active+undersized+degraded, last acting [1,3]
pg 3.d is active+undersized+degraded, acting [1,3]
recovery 23/55329 objects degraded (0.042%)

If I understand correctly, pg 3.d has only 2 replicas, the primary on osd.1 and the secondary on osd.3; there is no 3rd replica anywhere in the cluster. That's why it gives the unhealthy warning.

I tried decreasing the replication size to 2 for the image pool and the stuck pg disappeared. After I changed the size back to 3, Ceph still didn't create the 3rd replica for pg 3.d.

I also tried shutting down server 0, which hosts osd.0 and osd.1, leaving pg 3.d with only 1 replica in the cluster. It still didn't create another copy, even though size = 3 and min_size = 2.
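If more output would help, I can gather it with something like the following (imagesliberty is the pool that pg 3.d belongs to; I assume all pools use crush_ruleset 0 as shown in the osd dump above):

$ ceph osd pool get imagesliberty size      # confirm the pool is really back to size 3
$ ceph osd pool get imagesliberty min_size  # confirm min_size is 2
$ ceph pg 3.d query                         # up/acting sets and why no 3rd replica is chosen
$ ceph osd crush rule dump                  # which failure domain (host/rack) the rule replicates across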
Also, more pgs are now in degraded, undersized or unclean state.

$ ceph pg map 3.d
osdmap e796 pg 3.d (3.d) -> up [3] acting [3]

$ ceph -s
    cluster 2e906379-f211-4329-8faf-a8e7600b8418
     health HEALTH_WARN
            16 pgs degraded
            16 pgs stuck degraded
            2 pgs stuck inactive
            37 pgs stuck unclean
            16 pgs stuck undersized
            16 pgs undersized
            recovery 1427/55329 objects degraded (2.579%)
            recovery 780/55329 objects misplaced (1.410%)
     monmap e14: 2 mons at {psusnjhhdlc7ioscom002=192.168.2.62:6789/0,psusnjhhdlc7ioscon002=192.168.2.12:6789/0}
            election epoch 106, quorum 0,1 psusnjhhdlc7ioscon002,psusnjhhdlc7ioscom002
     osdmap e796: 8 osds: 6 up, 6 in; 21 remapped pgs
            flags sortbitwise
      pgmap v521445: 448 pgs, 3 pools, 51541 MB data, 18443 objects
            168 GB used, 8947 GB / 9116 GB avail
            1427/55329 objects degraded (2.579%)
            780/55329 objects misplaced (1.410%)
                 411 active+clean
                  21 active+remapped
                  14 active+undersized+degraded
                   2 undersized+degraded+peered

Can anyone advise how to fix the pg 3.d problem, and why Ceph couldn't recover when I shut down one server (2 OSDs)?

Thanks
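PS: if the full CRUSH map would help, I can decompile and attach it, roughly like this:

$ ceph osd getcrushmap -o crushmap.bin       # grab the compiled CRUSH map from the cluster
$ crushtool -d crushmap.bin -o crushmap.txt  # decompile to text to show the rules and their chooseleaf step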
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com