Just making sure this makes the list:

Mac Wynkoop

---------- Forwarded message ---------
From: 胡 玮文 <huww98@xxxxxxxxxxx>
Date: Wed, Oct 7, 2020 at 9:00 PM
Subject: Re: pool pgp_num not updated
To: Mac Wynkoop <mwynkoop@xxxxxxxxxxxx>

Hi,

You can read about this behavior at
https://ceph.io/rados/new-in-nautilus-pg-merging-and-autotuning/

In short, Ceph will not increase pgp_num while more than 5% of objects are misplaced (by default). Once misplaced drops below 5%, it increases pgp_num gradually until it reaches the value you set. This 5% threshold is configured by the target_max_misplaced_ratio config option.

On Oct 8, 2020, at 03:22, Mac Wynkoop <mwynkoop@xxxxxxxxxxxx> wrote:

Well, backfilling sure, but will it allow me to actually change the pgp_num as more space frees up? Because the issue is that I cannot modify that value.

Thanks,
Mac Wynkoop, Senior Datacenter Engineer
NetDepot.com: Cloud Servers; Delivered
Houston | Atlanta | NYC | Colorado Springs
1-844-25-CLOUD Ext 806

On Wed, Oct 7, 2020 at 1:50 PM Eugen Block <eblock@xxxxxx> wrote:

Yes, I think that's exactly the reason. As soon as the cluster has more space the backfill will continue.

Quoting Mac Wynkoop <mwynkoop@xxxxxxxxxxxx>:

The cluster is currently in a warn state; here's the scrubbed output of ceph -s:

  cluster:
    id:     *redacted*
    health: HEALTH_WARN
            noscrub,nodeep-scrub flag(s) set
            22 nearfull osd(s)
            2 pool(s) nearfull
            Low space hindering backfill (add storage if this doesn't resolve itself): 277 pgs backfill_toofull
            Degraded data redundancy: 32652738/3651947772 objects degraded (0.894%), 281 pgs degraded, 341 pgs undersized
            1214 pgs not deep-scrubbed in time
            2647 pgs not scrubbed in time
            2 daemons have recently crashed

  services:
    mon:         5 daemons, *redacted* (age 44h)
    mgr:         *redacted*
    osd:         162 osds: 162 up (since 44h), 162 in (since 4d); 971 remapped pgs
                 flags noscrub,nodeep-scrub
    rgw:         3 daemons active *redacted*
    tcmu-runner: 18 daemons active *redacted*

  data:
    pools:   10 pools, 2648 pgs
    objects: 409.56M objects, 738 TiB
    usage:   1.3 PiB used, 580 TiB / 1.8 PiB avail
    pgs:     32652738/3651947772 objects degraded (0.894%)
             517370913/3651947772 objects misplaced (14.167%)
             1677 active+clean
              477 active+remapped+backfill_wait
              100 active+remapped+backfill_wait+backfill_toofull
               80 active+undersized+degraded+remapped+backfill_wait
               60 active+undersized+degraded+remapped+backfill_wait+backfill_toofull
               42 active+undersized+degraded+remapped+backfill_toofull
               33 active+undersized+degraded+remapped+backfilling
               25 active+remapped+backfilling
               25 active+remapped+backfill_toofull
               24 active+undersized+remapped+backfilling
               23 active+forced_recovery+undersized+degraded+remapped+backfill_wait
               19 active+forced_recovery+undersized+degraded+remapped+backfill_wait+backfill_toofull
               15 active+undersized+remapped+backfill_wait
               14 active+undersized+remapped+backfill_wait+backfill_toofull
               12 active+forced_recovery+undersized+degraded+remapped+backfill_toofull
               12 active+forced_recovery+undersized+degraded+remapped+backfilling
                5 active+undersized+remapped+backfill_toofull
                3 active+remapped
                1 active+undersized+remapped
                1 active+forced_recovery+undersized+remapped+backfilling

  io:
    client:   287 MiB/s rd, 40 MiB/s wr, 1.94k op/s rd, 165 op/s wr
    recovery: 425 MiB/s, 225 objects/s

Now as you can see, we do have a lot of backfill operations going on at the moment. Does that actually prevent Ceph from modifying the pgp_num value of a pool?

Thanks,
Mac Wynkoop

On Wed, Oct 7, 2020 at 8:57 AM Eugen Block <eblock@xxxxxx> wrote:

What is the current cluster status, is it healthy? Maybe increasing pg_num would hit the limit of mon_max_pg_per_osd? Can you share 'ceph -s' output?
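(For reference, a rough way to check both of the things mentioned above: the misplaced ratio that gates pgp_num increases, and how close each OSD is to the mon_max_pg_per_osd limit. This is only a sketch and assumes a Nautilus-or-later cluster using the centralized config database; the 0.07 value is just an example.)

    # misplaced ratio that gates pgp_num increases (default 0.05, i.e. the 5% above)
    ceph config get mgr target_max_misplaced_ratio
    ceph -s | grep misplaced

    # if you can tolerate more concurrent data movement, raise it, e.g. to 7%
    ceph config set mgr target_max_misplaced_ratio 0.07

    # PGs currently mapped to each OSD (PGS column) vs. the per-OSD limit
    # (on some releases 'config get' only prints values explicitly set in the config db)
    ceph osd df
    ceph config get mon mon_max_pg_per_osd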
Quoting Mac Wynkoop <mwynkoop@xxxxxxxxxxxx>:

Right, both Norman and I set the pg_num before the pgp_num. For example, here are my current pool settings:

"pool 40 '*redacted*.rgw.buckets.data' erasure size 9 min_size 7 crush_rule 2 object_hash rjenkins pg_num 2048 pgp_num 1024 pgp_num_target 2048 last_change 8458830 lfor 0/0/8445757 flags hashpspool,ec_overwrites,nodelete,backfillfull stripe_width 24576 fast_read 1 application rgw"

So, when I set:

    ceph osd pool set hou-ec-1.rgw.buckets.data pgp_num 2048

it returns:

    set pool 40 pgp_num to 2048

But upon checking the pool details again:

"pool 40 '*redacted*.rgw.buckets.data' erasure size 9 min_size 7 crush_rule 2 object_hash rjenkins pg_num 2048 pgp_num 1024 pgp_num_target 2048 last_change 8458870 lfor 0/0/8445757 flags hashpspool,ec_overwrites,nodelete,backfillfull stripe_width 24576 fast_read 1 application rgw"

the pgp_num value does not increase. Am I just doing something totally wrong?

Thanks,
Mac Wynkoop

On Tue, Oct 6, 2020 at 2:32 PM Marc Roos <M.Roos@xxxxxxxxxxxxxxxxx> wrote:

pg_num and pgp_num need to be the same, not?

3.5.1. Set the Number of PGs

To set the number of placement groups in a pool, you must specify the number of placement groups at the time you create the pool. See Create a Pool for details. Once you set placement groups for a pool, you can increase the number of placement groups (but you cannot decrease the number of placement groups). To increase the number of placement groups, execute the following:

    ceph osd pool set {pool-name} pg_num {pg_num}

Once you increase the number of placement groups, you must also increase the number of placement groups for placement (pgp_num) before your cluster will rebalance. The pgp_num should be equal to the pg_num. To increase the number of placement groups for placement, execute the following:

    ceph osd pool set {pool-name} pgp_num {pgp_num}

https://access.redhat.com/documentation/en-us/red_hat_ceph_storage/4/html/storage_strategies_guide/placement_groups_pgs

-----Original Message-----
To: norman
Cc: ceph-users
Subject: Re: pool pgp_num not updated

Hi everyone,

I'm seeing a similar issue here. Any ideas on this?

Mac Wynkoop,

On Sun, Sep 6, 2020 at 11:09 PM norman <norman.kern@xxxxxxx> wrote:

Hi guys,

When I updated the pg_num of a pool, I found it did not work (no rebalance happened). Does anyone know the reason? The pool's info:

pool 21 'openstack-volumes-rs' replicated size 3 min_size 2 crush_rule 21 object_hash rjenkins pg_num 1024 pgp_num 512 pgp_num_target 1024 autoscale_mode warn last_change 85103 lfor 82044/82044/82044 flags hashpspool,nodelete,selfmanaged_snaps stripe_width 0 application rbd
        removed_snaps [1~1e6,1e8~300,4e9~18,502~3f,542~11,554~1a,56f~1d7]
pool 22 'openstack-vms-rs' replicated size 3 min_size 2 crush_rule 22 object_hash rjenkins pg_num 512 pgp_num 512 pg_num_target 256 pgp_num_target 256 autoscale_mode warn last_change 84769 lfor 0/0/55294 flags hashpspool,nodelete,selfmanaged_snaps stripe_width 0 application rbd

The pgp_num_target is set, but pgp_num is not. I had scaled out new OSDs and the cluster was still backfilling before I set the value; is that the reason?
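That matches the behaviour described at the top of the thread: on Nautilus and later, setting pg_num or pgp_num only records a pg_num_target / pgp_num_target, and Ceph walks the actual values toward the target while the misplaced ratio stays under target_max_misplaced_ratio, so a backfilling, nearfull cluster will sit at the old pgp_num until space frees up. A minimal sketch of the workflow and of watching it converge (the pool name and the 2048 figure are placeholders):

    # request the new sizes; the change is recorded as pg_num_target / pgp_num_target
    # and applied gradually rather than all at once
    ceph osd pool set <pool> pg_num 2048
    ceph osd pool set <pool> pgp_num 2048

    # compare the actual values with the targets while backfill drains
    ceph osd pool get <pool> pg_num
    ceph osd pool get <pool> pgp_num
    ceph osd pool ls detail | grep <pool>

    # pgp_num creeps upward only while misplaced stays below the configured ratio
    ceph -s | grep misplaced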
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx