Hi all,
I am testing Ceph right now with 4 servers and 8 OSDs (all OSDs are up and in). I have 3 pools in my cluster (an image pool, a volume pool, and the default rbd pool); both the image and volume pools have replication size = 3. Based on the PG equation, there are 448 PGs in my cluster.
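For reference, the 448 is simply the sum of the per-pool pg_num values shown in the osd dump below:
64 (rbd) + 128 (imagesliberty) + 256 (volumesliberty) = 448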
$ ceph osd tree
ID WEIGHT TYPE NAME UP/DOWN REWEIGHT PRIMARY-AFFINITY
-1 16.07797 root default
-5 14.38599 rack rack1
-2 7.17599 host psusnjhhdlc7iosstb001
0 3.53899 osd.0 up 1.00000 1.00000
1 3.63699 osd.1 up 1.00000 1.00000
-3 7.20999 host psusnjhhdlc7iosstb002
2 3.63699 osd.2 up 1.00000 1.00000
3 3.57300 osd.3 up 1.00000 1.00000
-6 1.69199 rack rack2
-4 0.83600 host psusnjhhdlc7iosstb003
5 0.43500 osd.5 up 1.00000 1.00000
4 0.40099 osd.4 up 1.00000 1.00000
-7 0.85599 host psusnjhhdlc7iosstb004
6 0.40099 osd.6 up 1.00000 0
7 0.45499 osd.7 up 1.00000 0
$ ceph osd dump
pool 0 'rbd' replicated size 2 min_size 2 crush_ruleset 0 object_hash rjenkins pg_num 64 pgp_num 64 last_change 745 flags hashpspool stripe_width 0
pool 3 'imagesliberty' replicated size 3 min_size 2 crush_ruleset 0 object_hash rjenkins pg_num 128 pgp_num 128 last_change 777 flags hashpspool stripe_width 0
removed_snaps [1~1,8~c]
pool 4 'volumesliberty' replicated size 3 min_size 2 crush_ruleset 0 object_hash rjenkins pg_num 256 pgp_num 256 last_change 776 flags hashpspool stripe_width 0
removed_snaps [1~1,15~14,2a~1,2c~1,2e~24,57~2,5a~18,74~2,78~1,94~5,b7~2]
Right now the cluster health is HEALTH_WARN. I used "ceph health detail" to dump the details, and there is one stuck PG.
$ ceph -s
cluster 2e906379-f211-4329-8faf-a8e7600b8418
health HEALTH_WARN
1 pgs degraded
1 pgs stuck degraded
1 pgs stuck inactive
1 pgs stuck unclean
1 pgs stuck undersized
1 pgs undersized
recovery 23/55329 objects degraded (0.042%)
monmap e14: 2 mons at {psusnjhhdlc7ioscom002=192.168.2.62:6789/0,psusnjhhdlc7ioscon002=192.168.2.12:6789/0}
election epoch 106, quorum 0,1 psusnjhhdlc7ioscon002,psusnjhhdlc7ioscom002
osdmap e776: 8 osds: 8 up, 8 in
flags sortbitwise
pgmap v519644: 448 pgs, 3 pools, 51541 MB data, 18443 objects
170 GB used, 16294 GB / 16464 GB avail
23/55329 objects degraded (0.042%)
447 active+clean
1 undersized+degraded+peered
$ ceph health detail
HEALTH_WARN 1 pgs degraded; 1 pgs stuck unclean; 1 pgs undersized; recovery 23/55329 objects degraded (0.042%)
pg 3.d is stuck unclean for 58161.177025, current state active+undersized+degraded, last acting [1,3]
pg 3.d is active+undersized+degraded, acting [1,3]
recovery 23/55329 objects degraded (0.042%)
If I understand correctly, pg 3.d has only 2 replicas, the primary on osd.1 and the secondary on osd.3. There is no 3rd replica anywhere in the cluster, which is why it gives the health warning.
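If it helps, I can also attach the full peering information for that PG; as far as I know this is the command to pull it:
$ ceph pg 3.d query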
I tried decreasing the replication size to 2 for the image pool, and the stuck PG disappeared. After I changed the size back to 3, Ceph still did not create the 3rd replica for pg 3.d.
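For reference, these are roughly the commands I used to change the size (pool name as shown in the osd dump above):
$ ceph osd pool set imagesliberty size 2
$ ceph osd pool set imagesliberty size 3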
I also tried shutting down the server that hosts osd.0 and osd.1, which left pg 3.d with only 1 replica in the cluster. Ceph still did not create another copy, even though the pool has size = 3 and min_size = 2. In addition, more PGs went into degraded, undersized or unclean states.
$ ceph pg map 3.d
osdmap e796 pg 3.d (3.d) -> up [3] acting [3]
$ ceph -s
cluster 2e906379-f211-4329-8faf-a8e7600b8418
health HEALTH_WARN
16 pgs degraded
16 pgs stuck degraded
2 pgs stuck inactive
37 pgs stuck unclean
16 pgs stuck undersized
16 pgs undersized
recovery 1427/55329 objects degraded (2.579%)
recovery 780/55329 objects misplaced (1.410%)
monmap e14: 2 mons at {psusnjhhdlc7ioscom002=192.168.2.62:6789/0,psusnjhhdlc7ioscon002=192.168.2.12:6789/0}
election epoch 106, quorum 0,1 psusnjhhdlc7ioscon002,psusnjhhdlc7ioscom002
osdmap e796: 8 osds: 6 up, 6 in; 21 remapped pgs
flags sortbitwise
pgmap v521445: 448 pgs, 3 pools, 51541 MB data, 18443 objects
168 GB used, 8947 GB / 9116 GB avail
1427/55329 objects degraded (2.579%)
780/55329 objects misplaced (1.410%)
411 active+clean
21 active+remapped
14 active+undersized+degraded
2 undersized+degraded+peered
Can anyone advise how to fix the pg 3.d problem, and why Ceph couldn't recover when I shut down one server (2 OSDs)?
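In case the CRUSH rule is relevant here (all three pools use crush_ruleset 0), I can also attach its dump, e.g. from:
$ ceph osd crush rule dump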
Thanks