Re: pg_num != pgp_num - and unable to change.

Hi Jesper,

> In earlier versions of Ceph (without the autoscaler) I have only experienced
> that setting pg_num and pgp_num took immediate effect?

That's correct -- in recent Ceph (Nautilus and later) you can no longer
manipulate pgp_num directly. There is a backdoor setting (set pgp_num_actual
...) but I don't really recommend it.
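If you really must, it would look something like this (a rough sketch, not
recommended -- it bypasses the misplaced-ratio throttling described below,
so expect a large burst of data movement; <poolname> and <n> are placeholders):

  # force pgp_num immediately instead of waiting for the mgr to ramp it up
  ceph osd pool set <poolname> pgp_num_actual <n>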

Since Nautilus, the mgr automatically increases pgp_num (and pg_num) over
time until they reach your pg_num_target / pgp_num_target. (If you're a
source-code reader, check DaemonServer::adjust_pgs for how this works.)

In short, the mgr is throttled by the target_max_misplaced_ratio, which
defaults to 5%.

So if you want to split more aggressively,
increase target_max_misplaced_ratio.
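
For example (0.10 is only an illustrative value; the default is 0.05):

  # check the current throttle
  ceph config get mgr target_max_misplaced_ratio
  # allow up to 10% of objects to be misplaced at any one time
  ceph config set mgr target_max_misplaced_ratio 0.10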

Cheers, Dan

______________________________________________________
Clyso GmbH | Ceph Support and Consulting | https://www.clyso.com



On Wed, Jul 5, 2023 at 9:41 PM Jesper Krogh <jesper@xxxxxxxx> wrote:

> Hi.
>
> Fresh cluster - after a dance where the autoscaler did not work
> (returned blank) as described in the doc - I now seemingly have it
> working. It has bumped the target to something reasonable -- and is slowly
> incrementing pg_num and pgp_num by 2 over time (I hope this is correct?)
>
> But:
> jskr@dkcphhpcmgt028:/$ sudo ceph osd pool ls detail | grep 62
> pool 22 'cephfs.archive.ec62data' erasure profile ecprof62 size 8
> min_size 7 crush_rule 3 object_hash rjenkins pg_num 150 pgp_num 22
> pg_num_target 512 pgp_num_target 512 autoscale_mode on last_change 9159
> lfor 0/0/9147 flags hashpspool,ec_overwrites,selfmanaged_snaps,bulk
> stripe_width 24576 pg_num_min 128 target_size_ratio 0.4 application
> cephfs
>
> pg_num = 150
> pgp_num = 22
>
> and setting pgp_num seemingly has zero effect on the system ... not even
> with autoscaling set to off.
>
> jskr@dkcphhpcmgt028:/$ sudo ceph osd pool set cephfs.archive.ec62data
> pg_autoscale_mode off
> set pool 22 pg_autoscale_mode to off
> jskr@dkcphhpcmgt028:/$ sudo ceph osd pool set cephfs.archive.ec62data
> pgp_num 150
> set pool 22 pgp_num to 150
> jskr@dkcphhpcmgt028:/$ sudo ceph osd pool set cephfs.archive.ec62data
> pg_num_min 128
> set pool 22 pg_num_min to 128
> jskr@dkcphhpcmgt028:/$ sudo ceph osd pool set cephfs.archive.ec62data
> pg_num 150
> set pool 22 pg_num to 150
> jskr@dkcphhpcmgt028:/$ sudo ceph osd pool set cephfs.archive.ec62data
> pg_autoscale_mode on
> set pool 22 pg_autoscale_mode to on
> jskr@dkcphhpcmgt028:/$ sudo ceph progress
> PG autoscaler increasing pool 22 PGs from 150 to 512 (14s)
>      [............................]
> jskr@dkcphhpcmgt028:/$ sudo ceph osd pool ls detail | grep 62
> pool 22 'cephfs.archive.ec62data' erasure profile ecprof62 size 8
> min_size 7 crush_rule 3 object_hash rjenkins pg_num 150 pgp_num 22
> pg_num_target 512 pgp_num_target 512 autoscale_mode on last_change 9159
> lfor 0/0/9147 flags hashpspool,ec_overwrites,selfmanaged_snaps,bulk
> stripe_width 24576 pg_num_min 128 target_size_ratio 0.4 application
> cephfs
>
> pgp_num != pg_num ?
>
> In earlier versions of Ceph (without the autoscaler) I have only experienced
> that setting pg_num and pgp_num took immediate effect?
>
> Jesper
>
> jskr@dkcphhpcmgt028:/$ sudo ceph version
> ceph version 17.2.6 (d7ff0d10654d2280e08f1ab989c7cdf3064446a5) quincy
> (stable)
> jskr@dkcphhpcmgt028:/$ sudo ceph health
> HEALTH_OK
> jskr@dkcphhpcmgt028:/$ sudo ceph status
>    cluster:
>      id:     5c384430-da91-11ed-af9c-c780a5227aff
>      health: HEALTH_OK
>
>    services:
>      mon: 3 daemons, quorum dkcphhpcmgt031,dkcphhpcmgt029,dkcphhpcmgt028
> (age 15h)
>      mgr: dkcphhpcmgt031.afbgjx(active, since 32h), standbys:
> dkcphhpcmgt029.bnsegi, dkcphhpcmgt028.bxxkqd
>      mds: 2/2 daemons up, 1 standby
>      osd: 40 osds: 40 up (since 44h), 40 in (since 39h); 33 remapped pgs
>
>    data:
>      volumes: 2/2 healthy
>      pools:   9 pools, 495 pgs
>      objects: 24.85M objects, 60 TiB
>      usage:   117 TiB used, 158 TiB / 276 TiB avail
>      pgs:     13494029/145763897 objects misplaced (9.257%)
>               462 active+clean
>               23  active+remapped+backfilling
>               10  active+remapped+backfill_wait
>
>    io:
>      client:   0 B/s rd, 1.1 MiB/s wr, 0 op/s rd, 94 op/s wr
>      recovery: 705 MiB/s, 208 objects/s
>
>    progress:
>
>
> --
> Jesper Krogh
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



