Re: enabling pg_autoscaler on a large production storage?

Those ceph df figures and PG numbers all look fine to me.
I wouldn't start adjusting pg_num now -- leave the autoscaler module disabled.
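
If you want to double-check that state, something like the following
works (a sketch; <pool> is a placeholder, and the commands assume a
Nautilus-era cluster where the module is still opt-in):

    ceph mgr module ls                               # pg_autoscaler should not be under enabled_modules
    ceph osd pool ls detail | grep autoscale_mode    # per-pool mode
    ceph osd pool set <pool> pg_autoscale_mode off   # pin a single pool to off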

Some might be concerned about having 190 PGs on an OSD, but this is
fine provided you have ample memory (at least 3 GB per OSD).
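
To keep an eye on that over time, roughly (the 4 GiB value below is
only an example, not a figure from this thread; osd_memory_target
applies to BlueStore OSDs):

    ceph osd df                                      # the PGS column shows PGs per OSD
    ceph config get osd osd_memory_target
    ceph config set osd osd_memory_target 4294967296 # e.g. 4 GiB per OSD daemon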

Cheers, Dan

On Tue, Jun 16, 2020 at 2:23 PM Boris Behrens <bb@xxxxxxxxx> wrote:
>
> Oh, OK. I ask because we have two types of SSDs: 1.8 TB and 3.6 TB.
> The 1.8 TB drives have around 90-100 PGs each and the 3.6 TB drives around 150-190 PGs.
>
> Here is the output:
> RAW STORAGE:
>     CLASS     SIZE        AVAIL       USED        RAW USED     %RAW USED
>     ssd       956 TiB     360 TiB     595 TiB      596 TiB         62.35
>     TOTAL     956 TiB     360 TiB     595 TiB      596 TiB         62.35
>
> POOLS:
>     POOL         ID     STORED      OBJECTS     USED        %USED     MAX AVAIL
>     pool 1        1     580 GiB     149.31k     1.7 TiB      0.58        96 TiB
>     pool 3        3     208 TiB      61.66M     593 TiB     67.21        96 TiB
>     pool 4        4     7.9 MiB       2.01k      68 MiB         0        96 TiB
>     pool 5        5        19 B           2      36 KiB         0        96 TiB
>
>
>
> Am Di., 16. Juni 2020 um 14:13 Uhr schrieb Dan van der Ster
> <dan@xxxxxxxxxxxxxx>:
> >
> > On Tue, Jun 16, 2020 at 2:00 PM Boris Behrens <bb@xxxxxxxxx> wrote:
> > >
> > > See inline comments
> > > Dan van der Ster <dan@xxxxxxxxxxxxxx> wrote on Tue, Jun 16, 2020 at 7:07 PM:
> > > >
> > > > Could you share the output of
> > > >
> > > >     ceph osd pool ls detail
> > >
> > > pool 1 'pool 1' replicated size 3 min_size 2 crush_rule 0 object_hash
> > > rjenkins pg_num 256 pgp_num 256 autoscale_mode warn last_change
> > > 2318859 flags hashpspool min_write_recency_for_promote 1 stripe_width
> > > 0 application rbd
> > > pool 3 'pool 3' replicated size 3 min_size 2 crush_rule 0 object_hash
> > > rjenkins pg_num 16384 pgp_num 16384 autoscale_mode warn last_change
> > > 2544040 lfor 0/0/1952329 flags hashpspool,selfmanaged_snaps
> > > min_write_recency_for_promote 1 stripe_width 0 application rbd
> > > pool 4 'pool 4' replicated size 3 min_size 2 crush_rule 0 object_hash
> > > rjenkins pg_num 8 pgp_num 8 autoscale_mode warn last_change 2318859
> > > flags hashpspool min_write_recency_for_promote 1 stripe_width 0
> > > application rbd
> > > pool 5 'pool 5' replicated size 3 min_size 2 crush_rule 0 object_hash
> > > rjenkins pg_num 256 pgp_num 256 autoscale_mode warn last_change
> > > 2318859 flags hashpspool,selfmanaged_snaps
> > > min_write_recency_for_promote 1 stripe_width 0 application rbd
> > >
> >
> > OK now maybe share the output of `ceph df` so we can see how much data
> > is in each pool?
> >
> > Assuming the majority of your data is in 'pool 3' with 16384 PGs,
> > your current PG values are just fine (you should have around 110
> > PGs per OSD).
> > The pg_autoscaler aims for 100 PGs per OSD and doesn't make changes
> > unless a pool has 4x too few or too many PGs.
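> >
> > As a rough back-of-the-envelope check, using the pg_num values and
> > size 3 from your pool ls detail output above (the OSD count isn't
> > shown in this thread, so it stays a variable):
> >
> >     PG replicas  = (16384 + 256 + 256 + 8) x 3 = 50,712
> >     PGs per OSD ~= 50,712 / (number of OSDs)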
> >
> > Unless you are planning to put a large proportion of data into the
> > other pools, I'd leave pg_autoscaler disabled and move on to the next
> > task.
> >
> > -- Dan
> >
> > > the mgr module is not enabled yet.
> > >
> > > >
> > > > ?
> > > >
> > > > This way we can see how the pools are configured and can recommend
> > > > whether pg_autoscaler is worth enabling.
> > > >
> > >
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



