Re: Does CEPH limit the pgp_num which it will increase in one go?

Hi Maarten,

In the output of `ceph osd pool ls detail`, does that pool have pgp_num_target set to 2248?
If so, then yes, Ceph is moving pgp_num gradually towards that number.
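
For example, it shows up in the pool's line, something like this (pool id,
name, and the trimmed parts here are just placeholders from memory):

# ceph osd pool ls detail | grep <pool>
pool 6 '<pool>' replicated size 3 ... pg_num 4096 pgp_num 2108 pgp_num_target 2248 ...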

Cheers, Dan

> On 02/15/2022 8:55 AM Maarten van Ingen <maarten.vaningen@xxxxxxx> wrote:
> 
>  
> Hi,
> 
> After enabling the balancer (and setting it to upmap mode) on our environment, it's time to get the pgp_num on one of our pools on par with its pg_num.
> This pool has pg_num set to 4096 but pgp_num at 2048 (a mistake on our part).
> I just set pgp_num to 2248 to keep the data movement in check.
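> 
> For reference, that was done with something along these lines (pool name redacted):
> 
> ceph osd pool set <pool> pgp_num 2248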
> 
> Oddly enough, I see it has only increased to 2108. It's also odd that we now get this health warning: 1 pools have pg_num > pgp_num, which we haven't seen before…
> 
> 
> # ceph -s
>   cluster:
>     id:     <id>
>     health: HEALTH_WARN
>             1 pools have pg_num > pgp_num
> 
>   services:
>     mon: 5 daemons, quorum mon01,mon02,mon03,mon05,mon04 (age 3d)
>     mgr: mon01(active, since 3w), standbys: mon05, mon04, mon03, mon02
>     mds: cephfs:1 {0=mon04=up:active} 4 up:standby
>     osd: 1278 osds: 1278 up (since 68m), 1278 in (since 22h); 74 remapped pgs
> 
>   data:
>     pools:   28 pools, 13824 pgs
>     objects: 441.41M objects, 1.5 PiB
>     usage:   4.5 PiB used, 6.9 PiB / 11 PiB avail
>     pgs:     15652608/1324221126 objects misplaced (1.182%)
>              13693 active+clean
>              74    active+remapped+backfilling
>              56    active+clean+scrubbing+deep
>              1     active+clean+scrubbing
> 
>   io:
>     client:   187 MiB/s rd, 2.2 GiB/s wr, 11.11k op/s rd, 5.63k op/s wr
>     recovery: 1.8 GiB/s, 533 objects/s
> 
> 
> ceph osd pool get <pool> pgp_num
> pgp_num: 2108
> 
> Is this default behaviour in Ceph?
> I get the feeling the balancer might have something to do with it as well, since we have set the balancer to only allow 1% misplaced objects, also to limit data movement. If that's true, could I just set pgp_num to 4096 directly and let Ceph limit the data movement by itself?
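> 
> For reference, I believe the 1% limit was set roughly like this (going
> from memory, so treat the exact option name as my assumption):
> 
> ceph config set mgr target_max_misplaced_ratio 0.01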
> 
> We are running a fully updated Nautilus cluster.
> 
> Kind Regards,
> Maarten van Ingen
> 
> Specialist | SURF | maarten.vaningen@xxxxxxx | T +31 30 88 787 3000 | M +31 6 19 03 90 19 |
> SURF <http://www.surf.nl/> is the collaborative organisation for ICT in Dutch education and research
> _______________________________________________
> ceph-users mailing list -- ceph-users@xxxxxxx
> To unsubscribe send an email to ceph-users-leave@xxxxxxx