Hi all,

I'm working on an algorithm to estimate PG counts for a set of pools with minimal input from the user. The main target is OpenStack deployments. I know about ceph.com/pgcalc/, but I would like to write the rules down and turn them into Python code. Can you comment on the following, please?

Input:
* pg_count has no influence on performance except at very small values
* an OSD needs a little RAM and CPU to serve each PG
* more PGs make rebalancing smoother
* more PGs make peering take longer
* 300 PGs per OSD is a good default upper bound

Rules:
* PGs per OSD should stay below 300
* no pool should get fewer PGs than some preselected minimum (64 for now), and each pool should have at least one PG copy on every OSD
* since the cluster may grow, the algorithm should pick a value near the upper bound
* the PG count of a pool should be proportional to the amount of data in the pool

Algorithm (a rough Python sketch is at the end of this mail):
* the estimated total number of PG copies is (OSD * PG_COPY_PER_OSD), where PG_COPY_PER_OSD == 200 for now
* each small pool gets one PG copy per OSD, i.e. (OSD / pool_sz) groups, where pool_sz is the pool's replication factor
* all remaining PG copies are divided between the remaining pools proportionally to their weights (but no pool gets fewer PGs than the minimum). The default weights are:
    volumes - 16
    compute - 8
    backups - 4
    .rgw    - 4
    images  - 1
* each pool's PG count is rounded up to the next power of 2

Thanks

--
Kostiantyn Danilov aka koder.ua
Principal software engineer, Mirantis
skype:koder.ua
http://koder-ua.blogspot.com/
http://mirantis.com
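
P.S. Below is a minimal Python sketch of the rules above, just to make the discussion concrete. The constants and default weights are the ones from this mail; the function and parameter names (estimate_pg, next_pow2, small_pools, pool_sz) are my own, and since the mail does not say how a pool is classified as "small", those pools are passed in explicitly. Treat it as a sketch, not tested code.

    # Sketch of the PG-count estimation described above.
    PG_COPY_PER_OSD = 200   # target PG copies per OSD (upper bound is 300)
    MIN_PG_PER_POOL = 64    # preselected per-pool minimum

    DEFAULT_WEIGHTS = {
        'volumes': 16,
        'compute': 8,
        'backups': 4,
        '.rgw': 4,
        'images': 1,
    }


    def next_pow2(n):
        """Round n up to the nearest power of two."""
        p = 1
        while p < n:
            p *= 2
        return p


    def estimate_pg(n_osds, pool_sz, small_pools=(), weights=DEFAULT_WEIGHTS):
        """Return {pool_name: pg_num}.

        n_osds      - number of OSDs in the cluster
        pool_sz     - replication factor (assumed equal for all pools here)
        small_pools - names of pools that only get one PG copy per OSD
        weights     - relative data-size weights of the remaining pools
        """
        total_copies = n_osds * PG_COPY_PER_OSD
        result = {}

        # Small pools: one PG copy per OSD => OSD / pool_sz groups,
        # still subject to the per-pool minimum and power-of-2 rounding.
        for name in small_pools:
            pg = max(next_pow2(n_osds // pool_sz), MIN_PG_PER_POOL)
            result[name] = pg
            total_copies -= pg * pool_sz

        # Remaining PG copies are split between the weighted pools.
        total_weight = sum(weights.values())
        for name, weight in weights.items():
            share = total_copies * weight / total_weight / pool_sz
            result[name] = next_pow2(max(int(share), MIN_PG_PER_POOL))

        return result


    if __name__ == '__main__':
        # Example: 40 OSDs, 3x replication, one extra "small" pool.
        for pool, pg in sorted(estimate_pg(40, 3, small_pools=['.rgw.root']).items()):
            print(pool, pg)

With 40 OSDs and 3x replication this gives roughly volumes=2048, compute=1024, backups=512, .rgw=512, images=128 and 64 for the small pool, which keeps the cluster near (but under) the 200 PG copies per OSD target.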