Re: 6 pgs not deep-scrubbed in time

I now concur you should increase the pg_num as a first step for this
cluster. Disable the pg autoscaler for the volumes pool and increase it to
pg_num 256. Then re-assess and likely make the next power-of-2 jump to 512,
and probably beyond.
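
A minimal sketch of those first commands, assuming the pool name "volumes"
from your "ceph df" output below (double-check before running):

    ceph osd pool set volumes pg_autoscale_mode off   # stop the autoscaler fighting the manual change
    ceph osd pool set volumes pg_num 256              # pgp_num should follow automatically on Pacific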

Keep in mind this is not going to fix your short-term deep-scrub issue; in
fact, it will increase the number of not-scrubbed-in-time PGs until the
pg_num change is complete. This is because OSDs don't scrub while they are
backfilling.
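
If you want to keep an eye on things while the change and backfill run,
something along these lines should work (a rough sketch, not exact output):

    ceph -s                                      # overall backfill / pg_num progress
    ceph health detail | grep deep-scrubbed      # which PGs are currently behind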

I would sit at 256 for a couple of weeks and let scrubs happen, then
continue past 256, with the ultimate target of around 100-200 PGs per OSD,
which "ceph osd df tree" will show you in the PGS column.

Respectfully,

*Wes Dillingham*
wes@xxxxxxxxxxxxxxxxx
LinkedIn <http://www.linkedin.com/in/wesleydillingham>


On Tue, Jan 30, 2024 at 3:16 AM Michel Niyoyita <micou12@xxxxxxxxx> wrote:

> Dear team,
>
> below is the output of the "ceph df" command and the ceph version I am running
>
>  ceph df
> --- RAW STORAGE ---
> CLASS     SIZE    AVAIL     USED  RAW USED  %RAW USED
> hdd    433 TiB  282 TiB  151 TiB   151 TiB      34.82
> TOTAL  433 TiB  282 TiB  151 TiB   151 TiB      34.82
>
> --- POOLS ---
> POOL                   ID  PGS   STORED  OBJECTS     USED  %USED  MAX AVAIL
> device_health_metrics   1    1  1.1 MiB        3  3.2 MiB      0     73 TiB
> .rgw.root               2   32  3.7 KiB        8   96 KiB      0     73 TiB
> default.rgw.log         3   32  3.6 KiB      209  408 KiB      0     73 TiB
> default.rgw.control     4   32      0 B        8      0 B      0     73 TiB
> default.rgw.meta        5   32    382 B        2   24 KiB      0     73 TiB
> volumes                 6  128   21 TiB    5.68M   62 TiB  22.09     73 TiB
> images                  7   32  878 GiB  112.50k  2.6 TiB   1.17     73 TiB
> backups                 8   32      0 B        0      0 B      0     73 TiB
> vms                     9   32  881 GiB  174.30k  2.5 TiB   1.13     73 TiB
> testbench              10   32      0 B        0      0 B      0     73 TiB
> root@ceph-mon1:~# ceph --version
> ceph version 16.2.11 (3cf40e2dca667f68c6ce3ff5cd94f01e711af894) pacific
> (stable)
> root@ceph-mon1:~#
>
> please advise accordingly
>
> Michel
>
> On Mon, Jan 29, 2024 at 9:48 PM Frank Schilder <frans@xxxxxx> wrote:
>
> > You will have to look at the output of "ceph df" and make a decision to
> > balance "objects per PG" and "GB per PG". Increase the PG count most for
> > the pools with the worst of these two numbers, such that it balances out
> > as much as possible. If you have pools that see significantly more
> > user-IO than others, prioritise these.
> >
> > You will have to find out what works for your specific cluster; we can only give
> > general guidelines. Make changes, run benchmarks, re-evaluate. Take the
> > time for it. The better you know your cluster and your users, the better
> > the end result will be.
> >
> > Best regards,
> > =================
> > Frank Schilder
> > AIT Risø Campus
> > Bygning 109, rum S14
> >
> > ________________________________________
> > From: Michel Niyoyita <micou12@xxxxxxxxx>
> > Sent: Monday, January 29, 2024 2:04 PM
> > To: Janne Johansson
> > Cc: Frank Schilder; E Taka; ceph-users
> > Subject: Re:  Re: 6 pgs not deep-scrubbed in time
> >
> > This is how it is set; if you suggest making some changes, please
> > advise.
> >
> > Thank you.
> >
> >
> > ceph osd pool ls detail
> > pool 1 'device_health_metrics' replicated size 3 min_size 2 crush_rule 0
> > object_hash rjenkins pg_num 1 pgp_num 1 autoscale_mode on last_change
> 1407
> > flags hashpspool stripe_width 0 pg_num_max 32 pg_num_min 1 application
> > mgr_devicehealth
> > pool 2 '.rgw.root' replicated size 3 min_size 2 crush_rule 0 object_hash
> > rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change 1393 flags
> > hashpspool stripe_width 0 application rgw
> > pool 3 'default.rgw.log' replicated size 3 min_size 2 crush_rule 0
> > object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change
> > 1394 flags hashpspool stripe_width 0 application rgw
> > pool 4 'default.rgw.control' replicated size 3 min_size 2 crush_rule 0
> > object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change
> > 1395 flags hashpspool stripe_width 0 application rgw
> > pool 5 'default.rgw.meta' replicated size 3 min_size 2 crush_rule 0
> > object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change
> > 1396 flags hashpspool stripe_width 0 pg_autoscale_bias 4 application rgw
> > pool 6 'volumes' replicated size 3 min_size 2 crush_rule 0 object_hash
> > rjenkins pg_num 128 pgp_num 128 autoscale_mode on last_change 108802 lfor
> > 0/0/14812 flags hashpspool,selfmanaged_snaps stripe_width 0 application
> rbd
> >         removed_snaps_queue
> >
> [22d7~3,11561~2,11571~1,11573~1c,11594~6,1159b~f,115b0~1,115b3~1,115c3~1,115f3~1,115f5~e,11613~6,1161f~c,11637~1b,11660~1,11663~2,11673~1,116d1~c,116f5~10,11721~c]
> > pool 7 'images' replicated size 3 min_size 2 crush_rule 0 object_hash
> > rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change 94609 flags
> > hashpspool,selfmanaged_snaps stripe_width 0 application rbd
> > pool 8 'backups' replicated size 3 min_size 2 crush_rule 0 object_hash
> > rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change 1399 flags
> > hashpspool stripe_width 0 application rbd
> > pool 9 'vms' replicated size 3 min_size 2 crush_rule 0 object_hash
> > rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change 108783 lfor
> > 0/561/559 flags hashpspool,selfmanaged_snaps stripe_width 0 application
> rbd
> >         removed_snaps_queue [3fa~1,3fc~3,400~1,402~1]
> > pool 10 'testbench' replicated size 3 min_size 2 crush_rule 0 object_hash
> > rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change 20931 lfor
> > 0/20931/20929 flags hashpspool stripe_width 0
> >
> >
> > On Mon, Jan 29, 2024 at 2:09 PM Michel Niyoyita <micou12@xxxxxxxxx> wrote:
> > Thank you Janne,
> >
> > Is there no need to set some flags like "ceph osd set nodeep-scrub"?
> >
> > Thank you
> >
> > On Mon, Jan 29, 2024 at 2:04 PM Janne Johansson <icepic.dz@xxxxxxxxx> wrote:
> > On Mon, Jan 29, 2024 at 12:58, Michel Niyoyita <micou12@xxxxxxxxx> wrote:
> > >
> > > Thank you Frank,
> > >
> > > All disks are HDDs. I would like to know if I can increase the number
> > > of PGs live in production without a negative impact on the cluster.
> > > If yes, which commands should I use?
> >
> > Yes: "ceph osd pool set <poolname> pg_num <number larger than before>",
> > where the number should usually be a power of two that leads to
> > 100-200 PGs per OSD.
> >
> > --
> > May the most significant bit of your life be positive.
> >
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



