"too many PGs per OSD" in Hammer

Hi folks,

Calling on the collective Ceph knowledge here. Since upgrading to Hammer, we're now seeing:

     health HEALTH_WARN
            too many PGs per OSD (1536 > max 300)

We have 3 OSDs, so we used a pg_num of 128 for each pool, based on the suggestion here: http://ceph.com/docs/master/rados/operations/placement-groups/

We're also using the 12 default pools:
root@ca-deis-1:/# ceph osd lspools
0 rbd,1 data,2 metadata,3 .rgw.root,4 .rgw.control,5 .rgw,6 .rgw.gc,7 .users.uid,8 .users,9 .rgw.buckets.index,10 .rgw.buckets,11 .rgw.buckets.extra,
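
If I'm reading the warning right, that's exactly where the 1536 comes from:

    12 pools x 128 PGs x 3 replicas / 3 OSDs = 1536 PGs per OSD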


Here's the output of ceph osd dump:

root@ca-deis-1:/# ceph osd dump
epoch 46
fsid 7bd27c76-f5f8-4eea-819b-379177929653
created 2015-05-06 20:40:01.658764
modified 2015-05-06 21:05:18.391730
flags
pool 0 'rbd' replicated size 3 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 128 pgp_num 128 last_change 18 flags hashpspool stripe_width 0
pool 1 'data' replicated size 3 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 128 pgp_num 128 last_change 11 flags hashpspool crash_replay_interval 45 stripe_width 0
pool 2 'metadata' replicated size 3 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 128 pgp_num 128 last_change 10 flags hashpspool stripe_width 0
pool 3 '.rgw.root' replicated size 3 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 128 pgp_num 128 last_change 20 flags hashpspool stripe_width 0
pool 4 '.rgw.control' replicated size 3 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 128 pgp_num 128 last_change 22 flags hashpspool stripe_width 0
pool 5 '.rgw' replicated size 3 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 128 pgp_num 128 last_change 24 flags hashpspool stripe_width 0
pool 6 '.rgw.gc' replicated size 3 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 128 pgp_num 128 last_change 25 flags hashpspool stripe_width 0
pool 7 '.users.uid' replicated size 3 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 128 pgp_num 128 last_change 26 flags hashpspool stripe_width 0
pool 8 '.users' replicated size 3 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 128 pgp_num 128 last_change 28 flags hashpspool stripe_width 0
pool 9 '.rgw.buckets.index' replicated size 3 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 128 pgp_num 128 last_change 30 flags hashpspool stripe_width 0
pool 10 '.rgw.buckets' replicated size 3 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 128 pgp_num 128 last_change 35 flags hashpspool stripe_width 0
pool 11 '.rgw.buckets.extra' replicated size 3 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 128 pgp_num 128 last_change 40 flags hashpspool stripe_width 0
max_osd 3
osd.0 up   in  weight 1 up_from 4 up_thru 45 down_at 0 last_clean_interval [0,0) 10.132.162.16:6800/1 10.132.162.16:6801/1 10.132.162.16:6802/1 10.132.162.16:6803/1 exists,up d996b242-7fce-475f-a889-fa14038de180
osd.1 up   in  weight 1 up_from 7 up_thru 45 down_at 0 last_clean_interval [0,0) 10.132.253.121:6800/1 10.132.253.121:6801/1 10.132.253.121:6802/1 10.132.253.121:6803/1 exists,up 8ef7080d-ca37-4003-ae54-b76ddd13f752
osd.2 up   in  weight 1 up_from 45 up_thru 45 down_at 43 last_clean_interval [38,44) 10.132.253.118:6801/1 10.132.253.118:6805/1000001 10.132.253.118:6806/1000001 10.132.253.118:6807/1000001 exists,up 7b30f8aa-732b-4dca-bfbd-2dca9fb3c5ec

Note that we have 3 replicas of our data (size 3) so that we can operate with just one host up. 

We've seen performance issues before (especially during platform startup), which has me wondering: are we using too many placement groups, given the small number of OSDs and the fact that size=3 forces each OSD to hold a full copy of the data? Maybe the performance issues are to be expected, since the cluster has to peer so many PGs at startup.
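
For what it's worth, if my arithmetic is right, staying under the default of 300 PGs per OSD with this layout would mean dropping to roughly 16 PGs per pool:

    300 PGs/OSD x 3 OSDs / 3 replicas / 12 pools = 25 PGs per pool, i.e. 16 as the nearest power of two below that

and since pg_num can't be decreased on an existing pool, I assume we'd have to recreate the pools to actually get there.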

This setup hasn't changed since we were running Firefly and Giant, so I'm not sure what's different in Hammer. Any guidance is appreciated.
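
As a stopgap, I believe the threshold behind this warning is mon_pg_warn_max_per_osd (default 300), so I assume we could quiet it with something like:

    root@ca-deis-1:/# ceph tell mon.* injectargs '--mon_pg_warn_max_per_osd 1600'   # untested, value picked just as an example

and persist it in ceph.conf under [mon] as "mon pg warn max per osd = 1600" - but that feels like masking the symptom rather than fixing the layout, so I'd rather understand the right approach first.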

Thanks!

Chris
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
