Re: pool pgp_num not updated

Alrighty, so we're all recovered and balanced at this point, but now I'm
seeing this behavior:


pool 40 'hou-ec-1.rgw.buckets.data' erasure size 9 min_size 7 crush_rule 2
object_hash rjenkins pg_num 2048 pgp_num 1109 pgp_num_target 2048
last_change 8654141 lfor 0/0/8445757 flags hashpspool,ec_overwrites,nodelete
stripe_width 24576 fast_read 1 application rgw

I don't have the autoscaler enabled for the cluster, or for this pool, but
the pgp_num is slowly incrementing up toward the pgp_num_target value. If
the autoscaler isn't on, what part of Ceph is handling the increase of
pgp_num? I'd like to turn up the rate at which it splits the PGs, but if
the autoscaler isn't doing it, I'd have no clue what to adjust. Any ideas?
The only candidate knob I've run across so far is sketched below, but I
haven't confirmed it's the right one.
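
My unconfirmed guess is that on Nautilus and later it's the mgr itself that
walks pgp_num up toward pgp_num_target in small steps, throttled by how much
misplaced data it is willing to create at once, i.e. the
target_max_misplaced_ratio setting (default 0.05). If that's the mechanism at
work here, raising it should make the splits proceed faster, at the cost of
more concurrent data movement:

    # allow up to 10% of objects to be misplaced per adjustment step
    ceph config set mgr target_max_misplaced_ratio 0.10

    # read the value back to confirm it took effect
    ceph config get mgr target_max_misplaced_ratio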

Thanks,
Mac Wynkoop





On Thu, Oct 8, 2020 at 8:16 AM Mac Wynkoop <mwynkoop@xxxxxxxxxxxx> wrote:

> OK, great. We'll keep tabs on it for now then and try again once we're
> fully rebalanced.
> Mac Wynkoop, Senior Datacenter Engineer
> *NetDepot.com:* Cloud Servers; Delivered
> Houston | Atlanta | NYC | Colorado Springs
>
> 1-844-25-CLOUD Ext 806
>
>
>
>
> On Thu, Oct 8, 2020 at 2:08 AM Eugen Block <eblock@xxxxxx> wrote:
>
>> Yes, after your cluster has recovered you'll be able to increase
>> pgp_num. Or your change will be applied automatically since you
>> already set it; I'm not sure, but you'll see.
>>
>>
>> Zitat von Mac Wynkoop <mwynkoop@xxxxxxxxxxxx>:
>>
>> > Well, backfilling sure, but will it allow me to actually change the
>> pgp_num
>> > as more space frees up? Because the issue is that I cannot modify that
>> > value.
>> >
>> > Thanks,
>> > Mac Wynkoop, Senior Datacenter Engineer
>> > *NetDepot.com:* Cloud Servers; Delivered
>> > Houston | Atlanta | NYC | Colorado Springs
>> >
>> > 1-844-25-CLOUD Ext 806
>> >
>> >
>> >
>> >
>> > On Wed, Oct 7, 2020 at 1:50 PM Eugen Block <eblock@xxxxxx> wrote:
>> >
>> >> Yes, I think that’s exactly the reason. As soon as the cluster has
>> >> more space the backfill will continue.
>> >>
>> >>
>> >> Zitat von Mac Wynkoop <mwynkoop@xxxxxxxxxxxx>:
>> >>
>> >> > The cluster is currently in a warn state, here's the scrubbed output
>> >> > of ceph -s:
>> >> >
>> >> >   cluster:
>> >> >     id:     *redacted*
>> >> >     health: HEALTH_WARN
>> >> >             noscrub,nodeep-scrub flag(s) set
>> >> >             22 nearfull osd(s)
>> >> >             2 pool(s) nearfull
>> >> >             Low space hindering backfill (add storage if this doesn't resolve itself): 277 pgs backfill_toofull
>> >> >             Degraded data redundancy: 32652738/3651947772 objects degraded (0.894%), 281 pgs degraded, 341 pgs undersized
>> >> >             1214 pgs not deep-scrubbed in time
>> >> >             2647 pgs not scrubbed in time
>> >> >             2 daemons have recently crashed
>> >> >
>> >> >   services:
>> >> >     mon:         5 daemons, *redacted* (age 44h)
>> >> >     mgr:         *redacted*
>> >> >     osd:         162 osds: 162 up (since 44h), 162 in (since 4d); 971 remapped pgs
>> >> >                  flags noscrub,nodeep-scrub
>> >> >     rgw:         3 daemons active *redacted*
>> >> >     tcmu-runner: 18 daemons active *redacted*
>> >> >
>> >> >   data:
>> >> >     pools:   10 pools, 2648 pgs
>> >> >     objects: 409.56M objects, 738 TiB
>> >> >     usage:   1.3 PiB used, 580 TiB / 1.8 PiB avail
>> >> >     pgs:     32652738/3651947772 objects degraded (0.894%)
>> >> >              517370913/3651947772 objects misplaced (14.167%)
>> >> >              1677 active+clean
>> >> >              477  active+remapped+backfill_wait
>> >> >              100  active+remapped+backfill_wait+backfill_toofull
>> >> >              80   active+undersized+degraded+remapped+backfill_wait
>> >> >              60   active+undersized+degraded+remapped+backfill_wait+backfill_toofull
>> >> >              42   active+undersized+degraded+remapped+backfill_toofull
>> >> >              33   active+undersized+degraded+remapped+backfilling
>> >> >              25   active+remapped+backfilling
>> >> >              25   active+remapped+backfill_toofull
>> >> >              24   active+undersized+remapped+backfilling
>> >> >              23   active+forced_recovery+undersized+degraded+remapped+backfill_wait
>> >> >              19   active+forced_recovery+undersized+degraded+remapped+backfill_wait+backfill_toofull
>> >> >              15   active+undersized+remapped+backfill_wait
>> >> >              14   active+undersized+remapped+backfill_wait+backfill_toofull
>> >> >              12   active+forced_recovery+undersized+degraded+remapped+backfill_toofull
>> >> >              12   active+forced_recovery+undersized+degraded+remapped+backfilling
>> >> >              5    active+undersized+remapped+backfill_toofull
>> >> >              3    active+remapped
>> >> >              1    active+undersized+remapped
>> >> >              1    active+forced_recovery+undersized+remapped+backfilling
>> >> >
>> >> >   io:
>> >> >     client:   287 MiB/s rd, 40 MiB/s wr, 1.94k op/s rd, 165 op/s wr
>> >> >     recovery: 425 MiB/s, 225 objects/s
>> >> >
>> >> > Now as you can see, we do have a lot of backfill operations going on
>> >> > at the moment. Does that actually prevent Ceph from modifying the
>> >> > pgp_num value of a pool?
>> >> >
>> >> > Thanks,
>> >> > Mac Wynkoop
>> >> >
>> >> >
>> >> >
>> >> > On Wed, Oct 7, 2020 at 8:57 AM Eugen Block <eblock@xxxxxx> wrote:
>> >> >
>> >> >> What is the current cluster status, is it healthy? Maybe increasing
>> >> >> pg_num would hit the limit of mon_max_pg_per_osd? Can you share
>> >> >> 'ceph -s' output?
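
A quick way to check that limit and the current per-OSD PG count (assuming a
release with the centralized config database; the default for
mon_max_pg_per_osd varies by release):

    ceph config get mon mon_max_pg_per_osd   # per-OSD PG limit enforced by the mons/mgr
    ceph osd df                              # the PGS column shows current PGs per OSD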
>> >> >>
>> >> >>
>> >> >> Zitat von Mac Wynkoop <mwynkoop@xxxxxxxxxxxx>:
>> >> >>
>> >> >> > Right, both Norman and I set the pg_num before the pgp_num. For
>> >> >> > example, here are my current pool settings:
>> >> >> >
>> >> >> > "pool 40 '*redacted*.rgw.buckets.data' erasure size 9 min_size 7
>> >> >> > crush_rule 2 object_hash rjenkins pg_num 2048 pgp_num 1024
>> >> >> > pgp_num_target 2048 last_change 8458830 lfor 0/0/8445757 flags
>> >> >> > hashpspool,ec_overwrites,nodelete,backfillfull stripe_width 24576
>> >> >> > fast_read 1 application rgw"
>> >> >> >
>> >> >> > So, when I set:
>> >> >> >
>> >> >> >   "ceph osd pool set hou-ec-1.rgw.buckets.data pgp_num 2048"
>> >> >> >
>> >> >> > it returns:
>> >> >> >
>> >> >> >   "set pool 40 pgp_num to 2048"
>> >> >> >
>> >> >> > But upon checking the pool details again:
>> >> >> >
>> >> >> > "pool 40 '*redacted*.rgw.buckets.data' erasure size 9 min_size 7
>> >> >> > crush_rule 2 object_hash rjenkins pg_num 2048 pgp_num 1024
>> >> >> > pgp_num_target 2048 last_change 8458870 lfor 0/0/8445757 flags
>> >> >> > hashpspool,ec_overwrites,nodelete,backfillfull stripe_width 24576
>> >> >> > fast_read 1 application rgw"
>> >> >> >
>> >> >> > and the pgp_num value does not increase. Am I just doing something
>> >> >> > totally wrong?
>> >> >> >
>> >> >> > Thanks,
>> >> >> > Mac Wynkoop
>> >> >> >
>> >> >> >
>> >> >> >
>> >> >> >
>> >> >> > On Tue, Oct 6, 2020 at 2:32 PM Marc Roos <M.Roos@xxxxxxxxxxxxxxxxx> wrote:
>> >> >> >
>> >> >> >> pg_num and pgp_num need to be the same, not?
>> >> >> >>
>> >> >> >> 3.5.1. Set the Number of PGs
>> >> >> >>
>> >> >> >> To set the number of placement groups in a pool, you must specify
>> >> >> >> the number of placement groups at the time you create the pool.
>> >> >> >> See Create a Pool for details. Once you set placement groups for a
>> >> >> >> pool, you can increase the number of placement groups (but you
>> >> >> >> cannot decrease the number of placement groups). To increase the
>> >> >> >> number of placement groups, execute the following:
>> >> >> >>
>> >> >> >> ceph osd pool set {pool-name} pg_num {pg_num}
>> >> >> >>
>> >> >> >> Once you increase the number of placement groups, you must also
>> >> >> >> increase the number of placement groups for placement (pgp_num)
>> >> >> >> before your cluster will rebalance. The pgp_num should be equal to
>> >> >> >> the pg_num. To increase the number of placement groups for
>> >> >> >> placement, execute the following:
>> >> >> >>
>> >> >> >> ceph osd pool set {pool-name} pgp_num {pgp_num}
>> >> >> >>
>> >> >> >> https://access.redhat.com/documentation/en-us/red_hat_ceph_storage/4/html/storage_strategies_guide/placement_groups_pgs
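
As a follow-up to the excerpt above: the values can also be read back per
pool to confirm what actually took effect, e.g. for the pool in this thread:

    ceph osd pool get hou-ec-1.rgw.buckets.data pg_num
    ceph osd pool get hou-ec-1.rgw.buckets.data pgp_num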
>> >> >> >>
>> >> >> >> -----Original Message-----
>> >> >> >> To: norman
>> >> >> >> Cc: ceph-users
>> >> >> >> Subject:  Re: pool pgp_num not updated
>> >> >> >>
>> >> >> >> Hi everyone,
>> >> >> >>
>> >> >> >> I'm seeing a similar issue here. Any ideas on this?
>> >> >> >> Mac Wynkoop,
>> >> >> >>
>> >> >> >>
>> >> >> >>
>> >> >> >> On Sun, Sep 6, 2020 at 11:09 PM norman <norman.kern@xxxxxxx> wrote:
>> >> >> >>
>> >> >> >> > Hi guys,
>> >> >> >> >
>> >> >> >> > When I update the pg_num of a pool, I found it didn't work (no
>> >> >> >> > rebalancing happened). Does anyone know the reason? The pools' info:
>> >> >> >> >
>> >> >> >> > pool 21 'openstack-volumes-rs' replicated size 3 min_size 2
>> >> >> >> > crush_rule 21 object_hash rjenkins pg_num 1024 pgp_num 512
>> >> >> >> > pgp_num_target 1024 autoscale_mode warn last_change 85103 lfor
>> >> >> >> > 82044/82044/82044 flags hashpspool,nodelete,selfmanaged_snaps
>> >> >> >> > stripe_width 0 application rbd
>> >> >> >> >          removed_snaps
>> >> >> >> > [1~1e6,1e8~300,4e9~18,502~3f,542~11,554~1a,56f~1d7]
>> >> >> >> > pool 22 'openstack-vms-rs' replicated size 3 min_size 2
>> >> >> >> > crush_rule 22 object_hash rjenkins pg_num 512 pgp_num 512
>> >> >> >> > pg_num_target 256 pgp_num_target 256 autoscale_mode warn
>> >> >> >> > last_change 84769 lfor 0/0/55294 flags
>> >> >> >> > hashpspool,nodelete,selfmanaged_snaps stripe_width
>> >> >> >> > application rbd
>> >> >> >> >
>> >> >> >> > The pgp_num_target is set, but pgp_num is not.
>> >> >> >> >
>> >> >> >> > I had scaled out new OSDs and the cluster was backfilling before
>> >> >> >> > setting the value; is that the reason?

_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



