Hi,

I am learning Ceph and I am having a hard time understanding PGs and how the PG count is calculated.

I know that a PG is a collection of objects, and that PGs are replicated across hosts to satisfy the replication size, but...

In traditional storage we think in terms of size: GB, TB and so on. We create a pool from a bunch of disks or RAID arrays of some size, then we create volumes of a certain size and use them. If the storage is full, we add disks and then extend our pools/volumes. The idea of size is simple to understand.

Ceph does support the notion of pool size in GB, TB, etc., but pools are created with a number of PGs, and now there is also the notion of % of data.

When I use the PG calc from Ceph or from Red Hat, the generated yml file contains the % variable, but the commands file contains only the PG counts, and when I set 15% and 18% I get the same number of PGs for both!?

The PG calc also encourages you to make the %data values add up to 100%; in other words, it assumes that you know all your pools from the start. What if you won't consume all of your raw disk space? What happens when you need to add a new pool?

Also, when you create several pools and then execute ceph osd df tree, all pools show the raw size as free space. It is as if all pools share the same raw space regardless of their PG count.

Could someone shed some light on this concept and on how to manage it wisely? The documentation keeps saying that it is an important concept and that you have to pay attention when choosing the number of PGs for a pool from the start.

Regards.
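
P.S. To make the 15% vs. 18% observation concrete, here is a rough sketch of what I understand the calculator to be doing, based on the notes on the pgcalc page: take (target PGs per OSD x number of OSDs x %data) / replica size, then round to a power of two, going up when the lower power of two would be more than 25% below the raw value. The function name, the default target of 100 PGs per OSD and the 10-OSD, size-3 cluster are my own assumptions for illustration, and the sketch leaves out the calculator's per-pool minimum, so please correct me if the real logic differs.

```python
def suggested_pg_count(osd_count, replica_size, data_percent, target_pgs_per_osd=100):
    """Hypothetical helper approximating the pgcalc suggestion for one pool."""
    # Raw (unrounded) value from the formula described on the pgcalc page.
    raw = target_pgs_per_osd * osd_count * (data_percent / 100.0) / replica_size
    if raw <= 1:
        return 1

    # Largest power of two that does not exceed the raw value.
    lower = 1
    while lower * 2 <= raw:
        lower *= 2

    # Keep the lower power of two only if it is within 25% of the raw value;
    # otherwise move up to the next power of two.
    if lower >= 0.75 * raw:
        return lower
    return lower * 2


if __name__ == "__main__":
    # Assumed example cluster: 10 OSDs, replica size 3.
    for pct in (15, 18, 100):
        print(pct, suggested_pg_count(osd_count=10, replica_size=3, data_percent=pct))
```

If this is right, 15% gives a raw value of 50 and 18% gives 60, and both land on 64 after rounding, which would explain why the commands file shows the same PG count for the two pools.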