> It's the same formula. A k-times replicated pool has replication factor R.
> With the formula I stated below, you can compute the entire PG budget depending
> on what your PG target per OSD is. I'm afraid you will have to do that yourself.

Sorry, I meant a k-times replicated pool has replication factor R=k.

=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14

________________________________________
From: Frank Schilder <frans@xxxxxx>
Sent: 18 November 2020 09:25:46
To: Szabo, Istvan (Agoda); ceph-users@xxxxxxx
Subject: Re: Ceph EC PG calculation

It's the same formula. A k-times replicated pool has replication factor R. With the formula I stated below, you can compute the entire PG budget depending on what your PG target per OSD is. I'm afraid you will have to do that yourself.

Best regards,
Frank Schilder

________________________________________
From: Szabo, Istvan (Agoda) <Istvan.Szabo@xxxxxxxxx>
Sent: 18 November 2020 09:21:50
To: Frank Schilder; ceph-users@xxxxxxx
Subject: RE: Ceph EC PG calculation

Hi,

Thank you, Frank. And how does this affect the non-EC pools? They use the same device class, which is SSD. I'd calculate with 100 PGs/OSD, because the cluster will grow. If I calculate that for the EC pool, it comes to 512. But I still have many replicated pools 😊 Or should I just leave the autoscaler in warn mode and act when it tells me to?

To be honest, I just want to be sure my setup is correct and that I haven't missed something or done something wrong.

-----Original Message-----
From: Frank Schilder <frans@xxxxxx>
Sent: Wednesday, November 18, 2020 3:11 PM
To: Szabo, Istvan (Agoda) <Istvan.Szabo@xxxxxxxxx>; ceph-users@xxxxxxx
Subject: Re: Ceph EC PG calculation

Roughly speaking, if you have N OSDs, a replication factor of R and aim for P PGs/OSD on average, you can assign (N*P)/R PGs to the pool.
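A minimal Python sketch of this rule of thumb, using the numbers of the 4+2 EC case discussed in this thread (36 OSDs, a target of 50 PGs per OSD). The function names are hypothetical, and rounding down to a power of two is only one possible reading of "a close power of 2":

```python
def pg_budget(num_osds: int, pgs_per_osd: int, replication: int) -> int:
    """PGs a pool may get: (N * P) / R, rounded down to a whole number."""
    return (num_osds * pgs_per_osd) // replication


def floor_power_of_two(n: int) -> int:
    """Largest power of two that does not exceed n (assumed rounding choice)."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return 1 << (n.bit_length() - 1)


def pgs_added_per_osd(pg_num: int, replication: int, num_osds: int) -> float:
    """Average number of PG instances the pool places on each OSD."""
    return pg_num * replication / num_osds


# 4+2 EC pool: R = k + m = 6; a k-times replicated pool would use R = k.
budget = pg_budget(num_osds=36, pgs_per_osd=50, replication=6)
pg_num = floor_power_of_two(budget)
per_osd = pgs_added_per_osd(pg_num, replication=6, num_osds=36)

print(budget, pg_num, round(per_osd, 1))
```

The same three calls can be repeated per pool with R=3 for the replicated pools, so that the per-OSD contributions of all pools sharing a device class can be summed against the overall target.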
Example: 4+2 EC has replication 6. There are 36 OSDs. If you want to place, say, 50 PGs per OSD, you can assign (36*50)/6=300 PGs to the EC pool. You may pick a nearby power of 2 if you wish and then calculate how many PGs will be placed on each OSD on average. For example, if we choose 256 PGs, then 256*6/36 = 42.7 PGs per OSD will be added.

Best regards,
Frank Schilder

________________________________________
From: Szabo, Istvan (Agoda) <Istvan.Szabo@xxxxxxxxx>
Sent: 18 November 2020 04:58:38
To: ceph-users@xxxxxxx
Subject: Ceph EC PG calculation

Hi,

I have 36 OSDs and get this error:

Error ERANGE: pg_num 4096 size 6 would mean 25011 total pgs, which exceeds max 10500 (mon_max_pg_per_osd 250 * num_in_osds 42)

How do I calculate the maximum number of PGs for my cluster when I have an EC pool? My data pool is a 4+2 EC pool, and the others are replicated.

These are the pools:

pool 1 'device_health_metrics' replicated size 3 min_size 2 crush_rule 2 object_hash rjenkins pg_num 1 pgp_num 1 autoscale_mode warn last_change 597 flags hashpspool stripe_width 0 pg_num_min 1 application mgr_devicehealth
pool 2 '.rgw.root' replicated size 3 min_size 2 crush_rule 2 object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode warn last_change 598 flags hashpspool stripe_width 0 application rgw
pool 6 'sin.rgw.log' replicated size 3 min_size 2 crush_rule 2 object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode warn last_change 599 flags hashpspool stripe_width 0 application rgw
pool 7 'sin.rgw.control' replicated size 3 min_size 2 crush_rule 2 object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode warn last_change 600 flags hashpspool stripe_width 0 application rgw
pool 8 'sin.rgw.meta' replicated size 3 min_size 2 crush_rule 1 object_hash rjenkins pg_num 8 pgp_num 8 autoscale_mode warn last_change 601 lfor 0/393/391 flags hashpspool stripe_width 0 pg_autoscale_bias 4 pg_num_min 8 application rgw
pool 10 'sin.rgw.buckets.index' replicated size 3 min_size 2 crush_rule 1 object_hash rjenkins pg_num 8 pgp_num 8 autoscale_mode warn last_change 602 lfor 0/529/527 flags hashpspool stripe_width 0 pg_autoscale_bias 4 pg_num_min 8 application rgw
pool 11 'sin.rgw.buckets.data.old' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode warn last_change 603 flags hashpspool stripe_width 0 application rgw
pool 12 'sin.rgw.buckets.data' erasure profile data-ec size 6 min_size 5 crush_rule 3 object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode warn last_change 604 flags hashpspool,ec_overwrites stripe_width 16384 application rgw

So how can I calculate the PGs?

This is my OSD tree:

ID   CLASS  WEIGHT     TYPE NAME              STATUS  REWEIGHT  PRI-AFF
 -1         534.38354  root default
 -5          89.06392      host cephosd-6s01
 36   nvme    1.74660          osd.36             up   1.00000  1.00000
  0    ssd   14.55289          osd.0              up   1.00000  1.00000
  8    ssd   14.55289          osd.8              up   1.00000  1.00000
 15    ssd   14.55289          osd.15             up   1.00000  1.00000
 18    ssd   14.55289          osd.18             up   1.00000  1.00000
 24    ssd   14.55289          osd.24             up   1.00000  1.00000
 30    ssd   14.55289          osd.30             up   1.00000  1.00000
 -3          89.06392      host cephosd-6s02
 37   nvme    1.74660          osd.37             up   1.00000  1.00000
  1    ssd   14.55289          osd.1              up   1.00000  1.00000
 11    ssd   14.55289          osd.11             up   1.00000  1.00000
 17    ssd   14.55289          osd.17             up   1.00000  1.00000
 23    ssd   14.55289          osd.23             up   1.00000  1.00000
 28    ssd   14.55289          osd.28             up   1.00000  1.00000
 35    ssd   14.55289          osd.35             up   1.00000  1.00000
-11          89.06392      host cephosd-6s03
 41   nvme    1.74660          osd.41             up   1.00000  1.00000
  2    ssd   14.55289          osd.2              up   1.00000  1.00000
  6    ssd   14.55289          osd.6              up   1.00000  1.00000
 13    ssd   14.55289          osd.13             up   1.00000  1.00000
 19    ssd   14.55289          osd.19             up   1.00000  1.00000
 26    ssd   14.55289          osd.26             up   1.00000  1.00000
 32    ssd   14.55289          osd.32             up   1.00000  1.00000
-13          89.06392      host cephosd-6s04
 38   nvme    1.74660          osd.38             up   1.00000  1.00000
  5    ssd   14.55289          osd.5              up   1.00000  1.00000
  7    ssd   14.55289          osd.7              up   1.00000  1.00000
 14    ssd   14.55289          osd.14             up   1.00000  1.00000
 20    ssd   14.55289          osd.20             up   1.00000  1.00000
 25    ssd   14.55289          osd.25             up   1.00000  1.00000
 31    ssd   14.55289          osd.31             up   1.00000  1.00000
 -9          89.06392      host cephosd-6s05
 40   nvme    1.74660          osd.40             up   1.00000  1.00000
  3    ssd   14.55289          osd.3              up   1.00000  1.00000
 10    ssd   14.55289          osd.10             up   1.00000  1.00000
 12    ssd   14.55289          osd.12             up   1.00000  1.00000
 21    ssd   14.55289          osd.21             up   1.00000  1.00000
 29    ssd   14.55289          osd.29             up   1.00000  1.00000
 33    ssd   14.55289          osd.33             up   1.00000  1.00000
 -7          89.06392      host cephosd-6s06
 39   nvme    1.74660          osd.39             up   1.00000  1.00000
  4    ssd   14.55289          osd.4              up   1.00000  1.00000
  9    ssd   14.55289          osd.9              up   1.00000  1.00000
 16    ssd   14.55289          osd.16             up   1.00000  1.00000
 22    ssd   14.55289          osd.22             up   1.00000  1.00000
 27    ssd   14.55289          osd.27             up   1.00000  1.00000
 34    ssd   14.55289          osd.34             up   1.00000  1.00000

These are the CRUSH rules:

[
    {
        "rule_id": 0,
        "rule_name": "replicated_rule",
        "ruleset": 0,
        "type": 1,
        "min_size": 1,
        "max_size": 10,
        "steps": [
            { "op": "take", "item": -1, "item_name": "default" },
            { "op": "chooseleaf_firstn", "num": 0, "type": "host" },
            { "op": "emit" }
        ]
    },
    {
        "rule_id": 1,
        "rule_name": "replicated_nvme",
        "ruleset": 1,
        "type": 1,
        "min_size": 1,
        "max_size": 10,
        "steps": [
            { "op": "take", "item": -21, "item_name": "default~nvme" },
            { "op": "chooseleaf_firstn", "num": 0, "type": "host" },
            { "op": "emit" }
        ]
    },
    {
        "rule_id": 2,
        "rule_name": "replicated_ssd",
        "ruleset": 2,
        "type": 1,
        "min_size": 1,
        "max_size": 10,
        "steps": [
            { "op": "take", "item": -2, "item_name": "default~ssd" },
            { "op": "chooseleaf_firstn", "num": 0, "type": "host" },
            { "op": "emit" }
        ]
    },
    {
        "rule_id": 3,
        "rule_name": "sin.rgw.buckets.data.new",
        "ruleset": 3,
        "type": 3,
        "min_size": 3,
        "max_size": 6,
        "steps": [
            { "op": "set_chooseleaf_tries", "num": 5 },
            { "op": "set_choose_tries", "num": 100 },
            { "op": "take", "item": -2, "item_name": "default~ssd" },
            { "op": "chooseleaf_indep", "num": 0, "type": "host" },
            { "op": "emit" }
        ]
    }
]

So everything other than the data pool is on SSD and NVMe with replica 3.
If I calculate the PGs for the EC pool as 36 OSDs * 100 / 6 = 600, does that mean the maximum pg_num for the EC pool is 512? And how does this affect the replicated pools on SSD?

This is the EC pool definition:

crush-device-class=ssd
crush-failure-domain=host
crush-root=default
jerasure-per-chunk-alignment=false
k=4
m=2
plugin=jerasure
technique=reed_sol_van
w=8

Thank you in advance.
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
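The arithmetic behind the ERANGE message in the original question can be reproduced with a short Python sketch. It assumes the limit is checked by summing pg_num * size over all pools and comparing the result with mon_max_pg_per_osd * num_in_osds. That model is inferred from the numbers in the error message rather than taken from the Ceph source, and the function names are hypothetical. The pool values are copied from the pool listing in the question:

```python
# (name, pg_num, size) for each pool, as listed in the question.
POOLS = [
    ("device_health_metrics", 1, 3),
    (".rgw.root", 32, 3),
    ("sin.rgw.log", 32, 3),
    ("sin.rgw.control", 32, 3),
    ("sin.rgw.meta", 8, 3),
    ("sin.rgw.buckets.index", 8, 3),
    ("sin.rgw.buckets.data.old", 32, 3),
    ("sin.rgw.buckets.data", 32, 6),  # 4+2 EC pool, size = k + m = 6
]

MON_MAX_PG_PER_OSD = 250  # value shown in the error message
NUM_IN_OSDS = 42          # 36 SSD + 6 NVMe OSDs


def projected_total(pools, pool_name, new_pg_num):
    """Total PG instances if pool_name were resized to new_pg_num (assumed model)."""
    return sum(
        (new_pg_num if name == pool_name else pg_num) * size
        for name, pg_num, size in pools
    )


def largest_allowed_pg_num(pools, pool_name, limit):
    """Largest power-of-two pg_num for pool_name that keeps the total within limit."""
    others = sum(pg_num * size for name, pg_num, size in pools if name != pool_name)
    size = next(size for name, _, size in pools if name == pool_name)
    headroom = (limit - others) // size
    return 1 << (headroom.bit_length() - 1) if headroom >= 1 else 0


limit = MON_MAX_PG_PER_OSD * NUM_IN_OSDS
print(projected_total(POOLS, "sin.rgw.buckets.data", 4096))      # 25011, as in the error
print(limit)                                                     # 10500
print(largest_allowed_pg_num(POOLS, "sin.rgw.buckets.data", limit))  # 1024
```

Note that this only reproduces the cluster-wide hard limit. The target of 50 to 100 PGs per OSD discussed in the replies is much lower, and the EC pool is placed on the 36 SSD OSDs only, so 512 or fewer remains the more relevant figure for sizing.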