Hi all,

I'm working on an algorithm to estimate PG counts for a set of pools with minimal input from the user. The main target is OpenStack deployments. I know about ceph.com/pgcalc/, but I would like to write the rules down and turn them into Python code. Can you comment on the following, please?

Input:
* pg_count has no influence on performance except at very small values
* an OSD needs a little RAM and CPU to serve each PG
* more PGs make rebalancing smoother
* more PGs make peering take longer
* 300 PGs per OSD is a good default upper bound

Rules:
* PGs per OSD should stay below 300
* no pool should get fewer PGs than some preselected minimum (64 for now), and each pool should have at least one PG copy on every OSD
* since the cluster may grow, the algorithm should pick a value near the upper bound
* the PG count of a pool should be proportional to the amount of data in the pool

Algorithm (a rough Python sketch is at the end of this mail):
* the estimated total number of PG copies is (OSD * PG_COPY_PER_OSD), where PG_COPY_PER_OSD == 200 for now
* each small pool gets one PG copy per OSD, i.e. (OSD / pool_sz) groups, where pool_sz is the pool's replication factor
* all remaining PG copies are divided between the remaining pools proportionally to their weights (but no pool gets fewer PGs than the minimum). The default weights are:
    volumes - 16
    compute - 8
    backups - 4
    .rgw    - 4
    images  - 1
* each pool's PG count is rounded up to the next power of 2

Thanks

--
Kostiantyn Danilov aka koder.ua
Principal software engineer, Mirantis
skype:koder.ua
http://koder-ua.blogspot.com/
http://mirantis.com
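
P.S. Below is a minimal Python sketch of the rules above, just to make the discussion concrete. The constants and default weights are the ones from this mail; the function and parameter names (estimate_pg, next_pow2, small_pools, pool_sz) are my own, and since the mail does not say how a pool is classified as "small", those pools are passed in explicitly. Treat it as a sketch, not tested code.

    # Sketch of the PG-count estimation described above.
    PG_COPY_PER_OSD = 200   # target PG copies per OSD (upper bound is 300)
    MIN_PG_PER_POOL = 64    # preselected per-pool minimum

    DEFAULT_WEIGHTS = {
        'volumes': 16,
        'compute': 8,
        'backups': 4,
        '.rgw': 4,
        'images': 1,
    }


    def next_pow2(n):
        """Round n up to the nearest power of two."""
        p = 1
        while p < n:
            p *= 2
        return p


    def estimate_pg(n_osds, pool_sz, small_pools=(), weights=DEFAULT_WEIGHTS):
        """Return {pool_name: pg_num}.

        n_osds      - number of OSDs in the cluster
        pool_sz     - replication factor (assumed equal for all pools here)
        small_pools - names of pools that only get one PG copy per OSD
        weights     - relative data-size weights of the remaining pools
        """
        total_copies = n_osds * PG_COPY_PER_OSD
        result = {}

        # Small pools: one PG copy per OSD => OSD / pool_sz groups,
        # still subject to the per-pool minimum and power-of-2 rounding.
        for name in small_pools:
            pg = max(next_pow2(n_osds // pool_sz), MIN_PG_PER_POOL)
            result[name] = pg
            total_copies -= pg * pool_sz

        # Remaining PG copies are split between the weighted pools.
        total_weight = sum(weights.values())
        for name, weight in weights.items():
            share = total_copies * weight / total_weight / pool_sz
            result[name] = next_pow2(max(int(share), MIN_PG_PER_POOL))

        return result


    if __name__ == '__main__':
        # Example: 40 OSDs, 3x replication, one extra "small" pool.
        for pool, pg in sorted(estimate_pg(40, 3, small_pools=['.rgw.root']).items()):
            print(pool, pg)

With 40 OSDs and 3x replication this gives roughly volumes=2048, compute=1024, backups=512, .rgw=512, images=128 and 64 for the small pool, which keeps the cluster near (but under) the 200 PG copies per OSD target.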