Just making sure this makes the list:

Mac Wynkoop

---------- Forwarded message ---------
From: 胡 玮文 <huww98@xxxxxxxxxxx>
Date: Wed, Oct 7, 2020 at 9:00 PM
Subject: Re: pool pgp_num not updated
To: Mac Wynkoop <mwynkoop@xxxxxxxxxxxx>

Hi,

You can read about this behavior at
https://ceph.io/rados/new-in-nautilus-pg-merging-and-autotuning/

In short, Ceph will not increase pgp_num while more than 5% of objects are misplaced (by default). Once misplaced drops below 5%, it increases pgp_num gradually until it reaches the value you set. This 5% threshold is configured by the target_max_misplaced_ratio config option.

On Oct 8, 2020, at 03:22, Mac Wynkoop <mwynkoop@xxxxxxxxxxxx> wrote:

Well, backfilling sure, but will it allow me to actually change the pgp_num as more space frees up? Because the issue is that I cannot modify that value.

Thanks,
Mac Wynkoop, Senior Datacenter Engineer
NetDepot.com: Cloud Servers; Delivered
Houston | Atlanta | NYC | Colorado Springs
1-844-25-CLOUD Ext 806

On Wed, Oct 7, 2020 at 1:50 PM Eugen Block <eblock@xxxxxx> wrote:

Yes, I think that's exactly the reason. As soon as the cluster has more space the backfill will continue.

Quoting Mac Wynkoop <mwynkoop@xxxxxxxxxxxx>:

The cluster is currently in a warn state; here's the scrubbed output of ceph -s:

  cluster:
    id:     *redacted*
    health: HEALTH_WARN
            noscrub,nodeep-scrub flag(s) set
            22 nearfull osd(s)
            2 pool(s) nearfull
            Low space hindering backfill (add storage if this doesn't resolve itself): 277 pgs backfill_toofull
            Degraded data redundancy: 32652738/3651947772 objects degraded (0.894%), 281 pgs degraded, 341 pgs undersized
            1214 pgs not deep-scrubbed in time
            2647 pgs not scrubbed in time
            2 daemons have recently crashed

  services:
    mon:         5 daemons, *redacted* (age 44h)
    mgr:         *redacted*
    osd:         162 osds: 162 up (since 44h), 162 in (since 4d); 971 remapped pgs
                 flags noscrub,nodeep-scrub
    rgw:         3 daemons active *redacted*
    tcmu-runner: 18 daemons active *redacted*

  data:
    pools:   10 pools, 2648 pgs
    objects: 409.56M objects, 738 TiB
    usage:   1.3 PiB used, 580 TiB / 1.8 PiB avail
    pgs:     32652738/3651947772 objects degraded (0.894%)
             517370913/3651947772 objects misplaced (14.167%)
             1677 active+clean
              477 active+remapped+backfill_wait
              100 active+remapped+backfill_wait+backfill_toofull
               80 active+undersized+degraded+remapped+backfill_wait
               60 active+undersized+degraded+remapped+backfill_wait+backfill_toofull
               42 active+undersized+degraded+remapped+backfill_toofull
               33 active+undersized+degraded+remapped+backfilling
               25 active+remapped+backfilling
               25 active+remapped+backfill_toofull
               24 active+undersized+remapped+backfilling
               23 active+forced_recovery+undersized+degraded+remapped+backfill_wait
               19 active+forced_recovery+undersized+degraded+remapped+backfill_wait+backfill_toofull
               15 active+undersized+remapped+backfill_wait
               14 active+undersized+remapped+backfill_wait+backfill_toofull
               12 active+forced_recovery+undersized+degraded+remapped+backfill_toofull
               12 active+forced_recovery+undersized+degraded+remapped+backfilling
                5 active+undersized+remapped+backfill_toofull
                3 active+remapped
                1 active+undersized+remapped
                1 active+forced_recovery+undersized+remapped+backfilling

  io:
    client:   287 MiB/s rd, 40 MiB/s wr, 1.94k op/s rd, 165 op/s wr
    recovery: 425 MiB/s, 225 objects/s

Now as you can see, we do have a lot of backfill operations going on at the moment. Does that actually prevent Ceph from modifying the pgp_num value of a pool?

Thanks,
Mac Wynkoop

On Wed, Oct 7, 2020 at 8:57 AM Eugen Block <eblock@xxxxxx> wrote:

What is the current cluster status, is it healthy? Maybe increasing pg_num would hit the limit of mon_max_pg_per_osd? Can you share 'ceph -s' output?
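(For reference, a rough way to check both of the things mentioned above: the misplaced ratio that gates pgp_num increases, and how close each OSD is to the mon_max_pg_per_osd limit. This is only a sketch and assumes a Nautilus-or-later cluster using the centralized config database; the 0.07 value is just an example.)

    # misplaced ratio that gates pgp_num increases (default 0.05, i.e. the 5% above)
    ceph config get mgr target_max_misplaced_ratio
    ceph -s | grep misplaced

    # if you can tolerate more concurrent data movement, raise it, e.g. to 7%
    ceph config set mgr target_max_misplaced_ratio 0.07

    # PGs currently mapped to each OSD (PGS column) vs. the per-OSD limit
    # (on some releases 'config get' only prints values explicitly set in the config db)
    ceph osd df
    ceph config get mon mon_max_pg_per_osd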
Quoting Mac Wynkoop <mwynkoop@xxxxxxxxxxxx>:

Right, both Norman and I set the pg_num before the pgp_num. For example, here are my current pool settings:

"pool 40 '*redacted*.rgw.buckets.data' erasure size 9 min_size 7 crush_rule 2 object_hash rjenkins pg_num 2048 pgp_num 1024 pgp_num_target 2048 last_change 8458830 lfor 0/0/8445757 flags hashpspool,ec_overwrites,nodelete,backfillfull stripe_width 24576 fast_read 1 application rgw"

So, when I set:

    ceph osd pool set hou-ec-1.rgw.buckets.data pgp_num 2048

it returns:

    set pool 40 pgp_num to 2048

But upon checking the pool details again:

"pool 40 '*redacted*.rgw.buckets.data' erasure size 9 min_size 7 crush_rule 2 object_hash rjenkins pg_num 2048 pgp_num 1024 pgp_num_target 2048 last_change 8458870 lfor 0/0/8445757 flags hashpspool,ec_overwrites,nodelete,backfillfull stripe_width 24576 fast_read 1 application rgw"

the pgp_num value does not increase. Am I just doing something totally wrong?

Thanks,
Mac Wynkoop

On Tue, Oct 6, 2020 at 2:32 PM Marc Roos <M.Roos@xxxxxxxxxxxxxxxxx> wrote:

pg_num and pgp_num need to be the same, not?

3.5.1. Set the Number of PGs

To set the number of placement groups in a pool, you must specify the number of placement groups at the time you create the pool. See Create a Pool for details. Once you set placement groups for a pool, you can increase the number of placement groups (but you cannot decrease the number of placement groups). To increase the number of placement groups, execute the following:

    ceph osd pool set {pool-name} pg_num {pg_num}

Once you increase the number of placement groups, you must also increase the number of placement groups for placement (pgp_num) before your cluster will rebalance. The pgp_num should be equal to the pg_num. To increase the number of placement groups for placement, execute the following:

    ceph osd pool set {pool-name} pgp_num {pgp_num}

https://access.redhat.com/documentation/en-us/red_hat_ceph_storage/4/html/storage_strategies_guide/placement_groups_pgs

-----Original Message-----
To: norman
Cc: ceph-users
Subject: Re: pool pgp_num not updated

Hi everyone,

I'm seeing a similar issue here. Any ideas on this?

Mac Wynkoop,

On Sun, Sep 6, 2020 at 11:09 PM norman <norman.kern@xxxxxxx> wrote:

Hi guys,

When I updated the pg_num of a pool, I found it did not work (no rebalance happened). Does anyone know the reason? The pool's info:

pool 21 'openstack-volumes-rs' replicated size 3 min_size 2 crush_rule 21 object_hash rjenkins pg_num 1024 pgp_num 512 pgp_num_target 1024 autoscale_mode warn last_change 85103 lfor 82044/82044/82044 flags hashpspool,nodelete,selfmanaged_snaps stripe_width 0 application rbd
        removed_snaps [1~1e6,1e8~300,4e9~18,502~3f,542~11,554~1a,56f~1d7]
pool 22 'openstack-vms-rs' replicated size 3 min_size 2 crush_rule 22 object_hash rjenkins pg_num 512 pgp_num 512 pg_num_target 256 pgp_num_target 256 autoscale_mode warn last_change 84769 lfor 0/0/55294 flags hashpspool,nodelete,selfmanaged_snaps stripe_width 0 application rbd

The pgp_num_target is set, but pgp_num is not. I had scaled out new OSDs and the cluster was still backfilling before I set the value; is that the reason?
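That matches the behaviour described at the top of the thread: on Nautilus and later, setting pg_num or pgp_num only records a pg_num_target / pgp_num_target, and Ceph walks the actual values toward the target while the misplaced ratio stays under target_max_misplaced_ratio, so a backfilling, nearfull cluster will sit at the old pgp_num until space frees up. A minimal sketch of the workflow and of watching it converge (the pool name and the 2048 figure are placeholders):

    # request the new sizes; the change is recorded as pg_num_target / pgp_num_target
    # and applied gradually rather than all at once
    ceph osd pool set <pool> pg_num 2048
    ceph osd pool set <pool> pgp_num 2048

    # compare the actual values with the targets while backfill drains
    ceph osd pool get <pool> pg_num
    ceph osd pool get <pool> pgp_num
    ceph osd pool ls detail | grep <pool>

    # pgp_num creeps upward only while misplaced stays below the configured ratio
    ceph -s | grep misplaced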
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx