Re: Algorithm for default pg_count calculation

Some more info:

Small PG count:
* More uneven data distribution across the cluster

Large PG count:
* More even data distribution across the cluster
* A very high number of PGs can starve CPU/RAM, causing performance to decrease

We are targeting 50 PGs per OSD to keep resource usage low and because we
can tolerate fairly uneven OSD utilization for now. We will tweak things
as we gain more experience.
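
For reference, the usual pgcalc-style estimate with that 50 PGs/OSD target
looks roughly like the sketch below (the function name and the simple
round-up-to-a-power-of-two are my simplification, not pgcalc's exact
behaviour):

    def pg_num_estimate(num_osds, target_pgs_per_osd=50, replica_size=3,
                        data_share=1.0):
        """Estimate pg_num for a pool expected to hold data_share of the data."""
        raw = num_osds * target_pgs_per_osd * data_share / replica_size
        pg_num = 1
        while pg_num < raw:      # round up to the next power of two
            pg_num *= 2
        return pg_num

    # e.g. 24 OSDs, 3x replication, one pool holding all the data -> 512 PGs
    print(pg_num_estimate(24))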
----------------
Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904  C70E E654 3BB2 FA62 B9F1


On Mon, Jul 27, 2015 at 8:22 AM, Konstantin Danilov
<kdanilov@xxxxxxxxxxxx> wrote:
> Hi all,
>
> I'm working on an algorithm to estimate the PG count for a set of pools
> with minimal input from the user. The main target is OpenStack deployments.
> I know about ceph.com/pgcalc/, but would like to write down the rules and
> get them as Python code.
>
> Can you comment on the following, please?
>
> Input:
> * pg_count has no influence on performance except for very small values
> * An OSD requires a little RAM and CPU to serve each PG
> * More PGs make rebalancing smoother
> * More PGs make peering take longer
> * 300 PGs per OSD is a good default upper bound
>
> Rules:
>
> * PGs per OSD should be less than 300
> * No pool should get fewer PGs than some preselected minimum (64 for now),
>   and each pool should have at least one PG copy on each OSD
> * Since the cluster may grow, the algorithm should pick a value near the upper bound
> * PG count for a pool should be proportional to the amount of data in the pool.
>
> Algorithm:
>
> * The estimated total number of PG copies is calculated as (OSD * PG_COPY_PER_OSD),
>   where PG_COPY_PER_OSD == 200 for now
> * Each small pool gets one PG copy per OSD, i.e. (OSD / pool_sz) placement groups
> * All remaining PGs are divided between the remaining pools in proportion to their
>   weights (but no pool should get fewer PGs than the minimal count).
>
>   By default the following weights are used:
>
>     volumes - 16
>     compute - 8
>     backups - 4
>     .rgw - 4
>     images - 1
>
> * Each PG count is rounded up to the next power of 2
>
> Thanks
>
> --
> Kostiantyn Danilov aka koder.ua
> Principal software engineer, Mirantis
>
> skype:koder.ua
> http://koder-ua.blogspot.com/
> http://mirantis.com
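
Below is a minimal Python sketch of the algorithm proposed in the quoted
message. It assumes a single replication factor (pool_size) for all pools;
the constant names, the small_pools parameter and the warning are mine,
not part of any existing implementation:

    # Constants taken from the proposal above.
    PG_COPY_PER_OSD = 200      # PG-copy budget per OSD used for sizing
    MIN_PG_PER_POOL = 64       # preselected per-pool minimum
    MAX_PG_PER_OSD = 300       # default upper bound

    # Default pool weights from the proposal.
    DEFAULT_WEIGHTS = {'volumes': 16, 'compute': 8, 'backups': 4,
                       '.rgw': 4, 'images': 1}

    def round_up_pow2(n):
        """Round a positive integer up to the next power of two."""
        return 1 << (n - 1).bit_length()

    def calc_pg_counts(osd_count, pool_size=3, small_pools=(),
                       weights=DEFAULT_WEIGHTS):
        """Return {pool: pg_num} following the rules sketched above."""
        budget = osd_count * PG_COPY_PER_OSD   # total PG copies to distribute
        pg_counts = {}

        # Small pools: one PG copy per OSD -> osd_count / pool_size PGs,
        # but never fewer than the per-pool minimum.
        for name in small_pools:
            pg = round_up_pow2(max(MIN_PG_PER_POOL, osd_count // pool_size))
            pg_counts[name] = pg
            budget -= pg * pool_size

        # The remaining pools share the rest of the budget by weight.
        total_weight = sum(weights.values())
        for name, weight in weights.items():
            share = budget * weight // (total_weight * pool_size)
            pg_counts[name] = round_up_pow2(max(MIN_PG_PER_POOL, share))

        # Check the result against the per-OSD upper bound; rounding every
        # pool up to a power of two can push small clusters over it.
        copies_per_osd = sum(pg * pool_size
                             for pg in pg_counts.values()) / float(osd_count)
        if copies_per_osd > MAX_PG_PER_OSD:
            print("warning: %.0f PG copies per OSD exceeds %d"
                  % (copies_per_osd, MAX_PG_PER_OSD))

        return pg_counts

    # e.g. 100 OSDs, 3x replication: volumes -> 4096 PGs, images -> 256 PGs
    print(calc_pg_counts(100))

Note that rounding every pool up to a power of two can overshoot the
300-PGs-per-OSD bound on smaller clusters, so a real implementation would
need to decide whether to scale the weights down or relax the rounding in
that case.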


