Hi Wodel,

The simple explanation is that placement groups (PGs) are a level of
storage abstraction above the drives (OSDs) and below objects (pools).
The links below may be helpful.

PGs consume resources, so they should be planned as well as you can.
That said, you can now scale pg_num up and down, and you can use the
autoscaler, so you don't have to be spot on right away. PGs peer and
replicate data according to your chosen CRUSH rules. Two quick examples
are at the end of this message, below your quoted question.

https://ceph.io/en/news/blog/2014/how-data-is-stored-in-ceph-cluster/
https://access.redhat.com/documentation/en-us/red_hat_ceph_storage/1.3/html/storage_strategies_guide/placement_groups_pgs
https://www.sebastien-han.fr/blog/2012/10/15/ceph-data-placement/

--
Alex Gorbachev
ISS Storcium

On Tue, Apr 25, 2023 at 6:10 PM wodel youchi <wodel.youchi@xxxxxxxxx> wrote:
> Hi,
>
> I am learning Ceph, and I am having a hard time understanding PGs and
> the PG calculation.
>
> I know that a PG is a collection of objects, and that PGs are
> replicated across hosts to respect the replication size, but...
>
> In traditional storage we use sizes in GB, TB, and so on: we create a
> pool from a bunch of disks or RAID arrays of some size, then we create
> volumes of a certain size and use them. If the storage fills up, we
> add disks and extend our pools/volumes. The idea of size is simple to
> understand.
>
> Ceph also supports the notion of pool size in GB, TB, etc., but pools
> are created using PGs, and now there is also the notion of % of data.
>
> When I use the PG calc from Ceph or from Red Hat, the generated YAML
> file contains the % variable, but the commands file contains only the
> PG counts, and pools configured with 15% and 18% of data end up with
> the same number of PGs!?
>
> The PG calc encourages you to make the %data values add up to 100; in
> other words, it assumes that you know all your pools from the start.
> What if you won't consume all your raw disk space? What happens when
> you need to add a new pool?
>
> Also, when you create several pools and then execute ceph osd df tree,
> every pool shows the full raw size as free space; it is as if all
> pools share the same raw space regardless of their PG counts.
>
> Could someone shed some light on this concept and how to manage it
> wisely? The documentation keeps saying that it's an important concept
> and that you have to pay attention when choosing the number of PGs for
> a pool from the start.
>
> Regards.
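P.S. On your 15% vs. 18% question: the calculators round each pool's
result to a power of two, which is why nearby percentages can land on
the same PG count. A rough worked example, with made-up numbers
(100 OSDs, a target of 100 PGs per OSD, replica size 3):

    total PGs ~= (100 OSDs * 100 PGs/OSD) / 3 replicas ~= 3333
    15% pool  -> 3333 * 0.15 = 500 -> nearest power of 2 = 512
    18% pool  -> 3333 * 0.18 = 600 -> nearest power of 2 = 512

As I recall, the calculator only bumps up to the next power of two when
the nearest one falls more than about 25% below the raw value, so both
pools end up at 512 here.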
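P.P.S. A minimal sketch of adjusting PGs after the fact, assuming a
hypothetical replicated pool named "mypool" (check the docs for your
release):

    # Let Ceph size this pool's PGs for you
    ceph osd pool set mypool pg_autoscale_mode on

    # Hint the expected share of the cluster so the autoscaler can plan
    # ahead (analogous to the %data column in the PG calculators)
    ceph osd pool set mypool target_size_ratio 0.15

    # Review what the autoscaler thinks of every pool
    ceph osd pool autoscale-status

    # Or set the PG count manually (powers of two are recommended)
    ceph osd pool set mypool pg_num 512

This also relates to your free-space observation: pools don't
pre-allocate raw space, they all draw from the same shared capacity.
The PG count controls how data is distributed and how much per-daemon
overhead a pool incurs, not how large the pool is.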