Re: should I increase the amount of PGs?

Yes, it would be safe to turn off the balancer; go ahead.

To know if adding more hardware will help, we need to see how much
longer this current splitting should take. This will help:

    ceph status
    ceph osd pool ls detail
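
Until that output is available, a very rough estimate can be made from the numbers in the `ceph status` quoted further down in this thread (160918554 misplaced out of 348689415 object copies, recovering at 922 objects/s). The sketch below is only illustrative: the function names are made up for this example, and it assumes a constant recovery rate, which does not hold while PGs sit in backfill_toofull, so treat the result as an optimistic lower bound.

```python
# Naive estimate: misplaced objects divided by the current recovery rate.
# Both inputs are copied from the `ceph status` output quoted in this thread.


def naive_backfill_hours(misplaced_objects: int, objects_per_sec: float) -> float:
    """Hours left if the recovery rate stayed constant (it usually does not)."""
    if objects_per_sec <= 0:
        raise ValueError("recovery rate must be positive")
    return misplaced_objects / objects_per_sec / 3600.0


def misplaced_percent(misplaced_objects: int, total_object_copies: int) -> float:
    """Share of object copies that still have to move, in percent."""
    return 100.0 * misplaced_objects / total_object_copies


hours = naive_backfill_hours(160_918_554, 922)
percent = misplaced_percent(160_918_554, 348_689_415)
print(f"{percent:.2f}% misplaced, roughly {hours:.0f} hours at a constant rate")
```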

-- dan

On Tue, Mar 30, 2021 at 3:00 PM Boris Behrens <bb@xxxxxxxxx> wrote:
>
> I would think it is due to splitting, because the balancer refuses to do its work when there are too many misplaced objects.
> I am also thinking of turning it off for now, so it doesn't begin its work at 5% misplaced objects.
>
> Would adding more hardware help? We wanted to add another OSD node with 7x8TB disks anyway, but postponed it due to the rebalancing.
>
> On Tue, Mar 30, 2021 at 2:23 PM Dan van der Ster <dan@xxxxxxxxxxxxxx> wrote:
>>
>> Are those PGs backfilling due to splitting or due to balancing?
>> If it's the former, I don't think there's a way to pause them with
>> upmap or any other trick.
>>
>> -- dan
>>
>> On Tue, Mar 30, 2021 at 2:07 PM Boris Behrens <bb@xxxxxxxxx> wrote:
>> >
>> > One week later, Ceph is still balancing.
>> > What worries me like hell is the %USE on a lot of those OSDs. Does Ceph
>> > resolve this on its own? We are currently down to 5TB space in the cluster.
>> > Rebalancing single OSDs doesn't work well and it increases the "misplaced
>> > objects".
>> >
>> > I thought about letting upmap do some rebalancing. Does anyone know if this
>> > is a good idea? Or should I bite my nails and wait, as I am having the
>> > headache of my life?
>> > [root@s3db1 ~]# ceph osd getmap -o om; osdmaptool om --upmap out.txt --upmap-pool eu-central-1.rgw.buckets.data --upmap-max 10; cat out.txt
>> > got osdmap epoch 321975
>> > osdmaptool: osdmap file 'om'
>> > writing upmap command output to: out.txt
>> > checking for upmap cleanups
>> > upmap, max-count 10, max deviation 5
>> >  limiting to pools eu-central-1.rgw.buckets.data ([11])
>> > pools eu-central-1.rgw.buckets.data
>> > prepared 10/10 changes
>> > ceph osd rm-pg-upmap-items 11.209
>> > ceph osd rm-pg-upmap-items 11.253
>> > ceph osd pg-upmap-items 11.7f 79 88
>> > ceph osd pg-upmap-items 11.fc 53 31 105 78
>> > ceph osd pg-upmap-items 11.1d8 84 50
>> > ceph osd pg-upmap-items 11.47f 94 86
>> > ceph osd pg-upmap-items 11.49c 44 71
>> > ceph osd pg-upmap-items 11.553 74 50
>> > ceph osd pg-upmap-items 11.6c3 66 63
>> > ceph osd pg-upmap-items 11.7ad 43 50
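
To read the proposal above more easily: each `pg-upmap-items` command takes a PG id followed by one or more from/to OSD pairs. The following is a minimal sketch that splits such a line into pairs; `parse_upmap_items` is a hypothetical helper written for this example, not part of osdmaptool or any Ceph tooling.

```python
from typing import List, Tuple


def parse_upmap_items(command: str) -> Tuple[str, List[Tuple[int, int]]]:
    """Split a 'ceph osd pg-upmap-items' line into (pgid, [(from_osd, to_osd), ...])."""
    parts = command.split()
    if parts[:3] != ["ceph", "osd", "pg-upmap-items"] or len(parts) < 6:
        raise ValueError(f"not a pg-upmap-items command: {command!r}")
    pgid = parts[3]
    osds = [int(p) for p in parts[4:]]
    if len(osds) % 2:
        raise ValueError("OSD ids must come in from/to pairs")
    # Even positions are source OSDs, odd positions are destinations.
    return pgid, list(zip(osds[0::2], osds[1::2]))


pgid, moves = parse_upmap_items("ceph osd pg-upmap-items 11.fc 53 31 105 78")
for src, dst in moves:
    print(f"PG {pgid}: osd.{src} -> osd.{dst}")
```

Listing the destinations this way makes it easy to check them against the %USE column below before applying anything.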
>> >
>> > ID  CLASS WEIGHT    REWEIGHT SIZE    RAW USE DATA     OMAP     META     AVAIL    %USE  VAR  PGS STATUS TYPE NAME
>> >  -1       795.42548        - 795 TiB 626 TiB  587 TiB   82 GiB 1.4 TiB  170 TiB 78.64 1.00   -        root default
>> >  56   hdd   7.32619  1.00000 7.3 TiB 6.4 TiB  6.4 TiB  684 MiB  16 GiB  910 GiB 87.87 1.12 129     up         osd.56
>> >  67   hdd   7.27739  1.00000 7.3 TiB 6.4 TiB  6.4 TiB  582 MiB  16 GiB  865 GiB 88.40 1.12 115     up         osd.67
>> >  79   hdd   3.63689  1.00000 3.6 TiB 3.2 TiB  432 GiB  1.9 GiB     0 B  432 GiB 88.40 1.12  63     up         osd.79
>> >  53   hdd   7.32619  1.00000 7.3 TiB 6.5 TiB  6.4 TiB  971 MiB  22 GiB  864 GiB 88.48 1.13 114     up         osd.53
>> >  51   hdd   7.27739  1.00000 7.3 TiB 6.5 TiB  6.4 TiB  734 MiB  15 GiB  837 GiB 88.77 1.13 120     up         osd.51
>> >  73   hdd  14.55269  1.00000  15 TiB  13 TiB   13 TiB  1.8 GiB  39 GiB  1.6 TiB 88.97 1.13 246     up         osd.73
>> >  55   hdd   7.32619  1.00000 7.3 TiB 6.5 TiB  6.5 TiB  259 MiB  15 GiB  825 GiB 89.01 1.13 118     up         osd.55
>> >  70   hdd   7.27739  1.00000 7.3 TiB 6.5 TiB  6.5 TiB  291 MiB  16 GiB  787 GiB 89.44 1.14 119     up         osd.70
>> >  42   hdd   3.73630  1.00000 3.7 TiB 3.4 TiB  3.3 TiB  685 MiB 8.2 GiB  374 GiB 90.23 1.15  60     up         osd.42
>> >  94   hdd   3.63869  1.00000 3.6 TiB 3.3 TiB  3.3 TiB  132 MiB 7.7 GiB  345 GiB 90.75 1.15  64     up         osd.94
>> >  25   hdd   3.73630  1.00000 3.7 TiB 3.4 TiB  3.3 TiB  3.2 MiB 8.1 GiB  352 GiB 90.79 1.15  53     up         osd.25
>> >  31   hdd   7.32619  1.00000 7.3 TiB 6.7 TiB  6.6 TiB  223 MiB  15 GiB  690 GiB 90.80 1.15 117     up         osd.31
>> >  84   hdd   7.52150  1.00000 7.5 TiB 6.8 TiB  6.6 TiB  159 MiB  16 GiB  699 GiB 90.93 1.16 121     up         osd.84
>> >  82   hdd   3.63689  1.00000 3.6 TiB 3.3 TiB  332 GiB  1.0 GiB     0 B  332 GiB 91.08 1.16  59     up         osd.82
>> >  89   hdd   7.52150  1.00000 7.5 TiB 6.9 TiB  6.6 TiB  400 MiB  15 GiB  670 GiB 91.29 1.16 126     up         osd.89
>> >  33   hdd   3.73630  1.00000 3.7 TiB 3.4 TiB  3.3 TiB  382 MiB 8.6 GiB  327 GiB 91.46 1.16  66     up         osd.33
>> >  90   hdd   7.52150  1.00000 7.5 TiB 6.9 TiB  6.6 TiB  338 MiB  15 GiB  658 GiB 91.46 1.16 112     up         osd.90
>> > 105   hdd   3.63869  0.89999 3.6 TiB 3.3 TiB  3.3 TiB  206 MiB 8.1 GiB  301 GiB 91.91 1.17  56     up         osd.105
>> >  66   hdd   7.27739  0.95000 7.3 TiB 6.7 TiB  6.7 TiB  322 MiB  16 GiB  548 GiB 92.64 1.18 121     up         osd.66
>> >  46   hdd   7.27739  1.00000 7.3 TiB 6.8 TiB  6.7 TiB  316 MiB  16 GiB  536 GiB 92.81 1.18 119     up         osd.46
>> >
>> > On Tue, Mar 23, 2021 at 7:59 PM Boris Behrens <bb@xxxxxxxxx> wrote:
>> >
>> > > Good point. Thanks for the hint. I changed it for all OSDs from 5 to 1
>> > > *crossing fingers*
>> > >
>> > > On Tue, Mar 23, 2021 at 7:45 PM Dan van der Ster <dan@xxxxxxxxxxxxxx> wrote:
>> > >
>> > >> I see. When splitting PGs, the OSDs will temporarily increase their
>> > >> used space to make room for the new PGs.
>> > >> When going from 1024->2048 PGs, that means that half of the objects from
>> > >> each PG will be copied to a new PG, and then the previous PGs will have
>> > >> those objects deleted.
>> > >>
>> > >> Make sure osd_max_backfills is set to 1, so that not too many PGs are
>> > >> moving concurrently.
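
To illustrate why roughly half of each PG's objects move when going from 1024 to 2048 PGs, here is a small sketch. It is modelled on Ceph's `ceph_stable_mod` helper as I remember it; `pg_for_hash` is a hypothetical wrapper, and plain integers stand in for the real object-name hashes, so this is an illustration of the idea rather than the actual placement code.

```python
def stable_mod(x: int, b: int, bmask: int) -> int:
    """Map hash x onto one of b PGs; bmask is the next power of two minus one."""
    return x & bmask if (x & bmask) < b else x & (bmask >> 1)


def pg_for_hash(x: int, pg_num: int) -> int:
    """Hypothetical wrapper: derive the bit mask from pg_num, then apply stable_mod."""
    bmask = (1 << (pg_num - 1).bit_length()) - 1
    return stable_mod(x, pg_num, bmask)


OLD_PG_NUM, NEW_PG_NUM = 1024, 2048
hashes = range(1 << 16)  # stand-in for object name hashes (assumption)

moved = 0
for h in hashes:
    old_pg = pg_for_hash(h, OLD_PG_NUM)
    new_pg = pg_for_hash(h, NEW_PG_NUM)
    # An object either stays in its parent PG or lands in exactly one child PG.
    assert new_pg in (old_pg, old_pg + OLD_PG_NUM)
    moved += new_pg != old_pg

print(f"{moved / len(hashes):.0%} of the sample hashes land in a child PG")
```

Each parent PG has a single child, which is why the copy-then-delete described above temporarily needs extra space on the OSDs involved.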
>> > >>
>> > >>
>> > >>
>> > >> On Tue, Mar 23, 2021, 7:39 PM Boris Behrens <bb@xxxxxxxxx> wrote:
>> > >>
>> > >>> Thank you.
>> > >>> Currently I do not have any full OSDs (all <90%), but I will keep this in mind.
>> > >>> What worries me is the ever increasing %USE metric (it went up from
>> > >>> around 72% to 75% in three hours). It looks as if a lot of data is coming
>> > >>> in (although barely any new data is arriving at the moment), but I think
>> > >>> this might have to do with my "let's try to increase the PGs to 2048". I
>> > >>> hope that Ceph begins to split the old PGs into new ones and removes the
>> > >>> old PGs.
>> > >>>
>> > >>> ID  CLASS WEIGHT    REWEIGHT SIZE    RAW USE DATA    OMAP    META     AVAIL    %USE  VAR  PGS STATUS TYPE NAME
>> > >>>  -1       795.42548        - 795 TiB 597 TiB 556 TiB  88 GiB  1.4 TiB  198 TiB 75.12 1.00   -        root default
>> > >>>
>> > >>> On Tue, Mar 23, 2021 at 7:21 PM Dan van der Ster <dan@xxxxxxxxxxxxxx> wrote:
>> > >>>
>> > >>>> While you're watching things, if an OSD is getting too close for
>> > >>>> comfort to the full ratio, you can temporarily increase it, e.g.
>> > >>>>     ceph osd set-full-ratio 0.96
>> > >>>>
>> > >>>> But don't set that too high -- you can really break an OSD if it gets
>> > >>>> 100% full (and then can't delete objects or whatever...)
>> > >>>>
>> > >>>> -- dan
>> > >>>>
>> > >>>> On Tue, Mar 23, 2021 at 7:17 PM Boris Behrens <bb@xxxxxxxxx> wrote:
>> > >>>> >
>> > >>>> > OK, then I will try to reweight the fullest OSDs to 0.95 and see
>> > >>>> > if this helps.
>> > >>>> >
>> > >>>> > On Tue, Mar 23, 2021 at 7:13 PM Dan van der Ster <dan@xxxxxxxxxxxxxx> wrote:
>> > >>>> >>
>> > >>>> >> Data goes to *all* PGs uniformly.
>> > >>>> >> Max_avail is limited by the available space on the most full OSD --
>> > >>>> >> you should pay close attention to those and make sure they are moving
>> > >>>> >> in the right direction (decreasing!)
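
A simplified way to picture this: new data is spread across OSDs in proportion to their CRUSH weight, so the pool is out of room as soon as the first OSD runs out. The sketch below is only a toy model under that assumption; `estimate_max_avail` and the sample numbers are hypothetical, and the real calculation also reserves headroom for the full ratio, so do not expect it to reproduce the value Ceph reports.

```python
from typing import Dict, Tuple


def estimate_max_avail(
    osds: Dict[str, Tuple[float, float]], data_fraction: float = 1.0
) -> float:
    """Toy model of max_avail.

    osds maps an OSD name to (crush_weight, free_space). data_fraction is the
    usable share of raw space, e.g. k / (k + m) for an erasure-coded pool.
    """
    weighted = {n: (w, a) for n, (w, a) in osds.items() if w > 0}
    total_weight = sum(w for w, _ in weighted.values())
    # Each OSD receives weight / total_weight of all new raw data, so it fills
    # up once avail / share has been written. The smallest such value wins.
    raw = min(a / (w / total_weight) for w, a in weighted.values())
    return raw * data_fraction


# Hypothetical numbers (weights unitless, free space in GiB).
toy = {"osd.a": (1.0, 100.0), "osd.b": (1.0, 400.0), "osd.c": (2.0, 800.0)}
usable = estimate_max_avail(toy, data_fraction=2 / 3)
print(f"about {usable:.0f} GiB usable, limited by the fullest OSD")
```

In the toy data osd.a has the least free space relative to its weight, so it alone determines the result even though the other OSDs have plenty of room.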
>> > >>>> >>
>> > >>>> >> Another point -- IMHO you should aim to get all PGs active+clean
>> > >>>> >> before you add yet another batch of new disks. While there are PGs
>> > >>>> >> backfilling, your osdmaps are accumulating on the mons and osds --
>> > >>>> >> this itself will start to use a lot of space, and active+clean is the
>> > >>>> >> only way to trim the old maps.
>> > >>>> >>
>> > >>>> >> -- dan
>> > >>>> >>
>> > >>>> >> On Tue, Mar 23, 2021 at 7:05 PM Boris Behrens <bb@xxxxxxxxx> wrote:
>> > >>>> >> >
>> > >>>> >> > So,
>> > >>>> >> > do nothing and wait for Ceph to recover?
>> > >>>> >> >
>> > >>>> >> > In theory there should be enough disk space (more disks arrive
>> > >>>> >> > tomorrow), but I fear that there might be an issue when the backups
>> > >>>> >> > get exported overnight to this S3. Currently the max_avail lingers
>> > >>>> >> > around 13TB, and I hope that the data will go to PGs other than the
>> > >>>> >> > ones that are currently on the full OSDs.
>> > >>>> >> >
>> > >>>> >> >
>> > >>>> >> >
>> > >>>> >> > On Tue, Mar 23, 2021 at 6:58 PM Dan van der Ster <dan@xxxxxxxxxxxxxx> wrote:
>> > >>>> >> >>
>> > >>>> >> >> Hi,
>> > >>>> >> >>
>> > >>>> >> >> backfill_toofull is not a bad thing when the cluster is really full
>> > >>>> >> >> like yours. You should expect some of the most full OSDs to eventually
>> > >>>> >> >> start decreasing in usage, as the PGs are moved to the new OSDs. Those
>> > >>>> >> >> backfill_toofull states should then resolve themselves as the OSD
>> > >>>> >> >> usage flattens out.
>> > >>>> >> >> Keep an eye on the usage of the backfill_full and nearfull OSDs though
>> > >>>> >> >> -- if they do eventually go above the full_ratio (95% by default),
>> > >>>> >> >> then writes to those OSDs would stop.
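
As a quick way to see which OSDs are close to these thresholds, the sketch below classifies a %USE value. It assumes the usual defaults of 0.85 (nearfull), 0.90 (backfillfull) and 0.95 (full); the ratios actually in effect are printed by `ceph osd dump`. The function name is made up for this example, and the sample values are taken from the `ceph osd df` output quoted elsewhere in this thread.

```python
# Assumed defaults; check the values configured on the cluster before relying on them.
NEARFULL, BACKFILLFULL, FULL = 0.85, 0.90, 0.95


def fullness_state(use_percent: float) -> str:
    """Return the most severe threshold that a %USE value has reached."""
    ratio = use_percent / 100.0
    if ratio >= FULL:
        return "full"
    if ratio >= BACKFILLFULL:
        return "backfillfull"
    if ratio >= NEARFULL:
        return "nearfull"
    return "ok"


# %USE values copied from the OSD listing in this thread.
sample = {"osd.56": 87.87, "osd.42": 90.23, "osd.46": 92.81}
states = {name: fullness_state(use) for name, use in sample.items()}
for name, state in states.items():
    print(f"{name}: {state}")
```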
>> > >>>> >> >>
>> > >>>> >> >> But if on the other hand you're suffering from lots of slow ops or
>> > >>>> >> >> anything else visible to your users, then you could try to take some
>> > >>>> >> >> actions to slow down the rebalancing. Just let us know if that's the
>> > >>>> >> >> case and we can see about changing osd_max_backfills, some weights or
>> > >>>> >> >> maybe using the upmap-remapped tool.
>> > >>>> >> >>
>> > >>>> >> >> -- Dan
>> > >>>> >> >>
>> > >>>> >> >> On Tue, Mar 23, 2021 at 6:07 PM Boris Behrens <bb@xxxxxxxxx> wrote:
>> > >>>> >> >> >
>> > >>>> >> >> > Ok, I should have listened to you :)
>> > >>>> >> >> >
>> > >>>> >> >> > In the last week we added more storage, but the issue got worse instead.
>> > >>>> >> >> > Today I realized that the PGs were up to 90GB (the bytes column in
>> > >>>> >> >> > ceph pg ls said 95705749636), and the balancer kept mentioning the
>> > >>>> >> >> > 2048 PGs for this pool. We were at 72% utilization (ceph osd df tree,
>> > >>>> >> >> > first line) for our cluster and I increased the PGs to 2048.
>> > >>>> >> >> >
>> > >>>> >> >> > Now I am in a world of trouble.
>> > >>>> >> >> > The space in the cluster went down, I am at 45% misplaced objects,
>> > >>>> >> >> > and we already added 20x4TB disks just to not go down completely.
>> > >>>> >> >> >
>> > >>>> >> >> > The utilization is still going up and the overall free space in the
>> > >>>> >> >> > cluster seems to go down. This is what my ceph status looks like, and
>> > >>>> >> >> > now I really need help to get that thing back to normal:
>> > >>>> >> >> > [root@s3db1 ~]# ceph status
>> > >>>> >> >> >   cluster:
>> > >>>> >> >> >     id:     dca79fff-ffd0-58f4-1cff-82a2feea05f4
>> > >>>> >> >> >     health: HEALTH_WARN
>> > >>>> >> >> >             4 backfillfull osd(s)
>> > >>>> >> >> >             17 nearfull osd(s)
>> > >>>> >> >> >             37 pool(s) backfillfull
>> > >>>> >> >> >             13 large omap objects
>> > >>>> >> >> >             Low space hindering backfill (add storage if this doesn't resolve itself): 570 pgs backfill_toofull
>> > >>>> >> >> >
>> > >>>> >> >> >   services:
>> > >>>> >> >> >     mon: 3 daemons, quorum ceph-s3-mon1,ceph-s3-mon2,ceph-s3-mon3 (age 44m)
>> > >>>> >> >> >     mgr: ceph-mgr2(active, since 15m), standbys: ceph-mgr3, ceph-mgr1
>> > >>>> >> >> >     mds:  3 up:standby
>> > >>>> >> >> >     osd: 110 osds: 110 up (since 28m), 110 in (since 28m); 1535 remapped pgs
>> > >>>> >> >> >     rgw: 3 daemons active (eu-central-1, eu-msg-1, eu-secure-1)
>> > >>>> >> >> >
>> > >>>> >> >> >   task status:
>> > >>>> >> >> >
>> > >>>> >> >> >   data:
>> > >>>> >> >> >     pools:   37 pools, 4032 pgs
>> > >>>> >> >> >     objects: 116.23M objects, 182 TiB
>> > >>>> >> >> >     usage:   589 TiB used, 206 TiB / 795 TiB avail
>> > >>>> >> >> >     pgs:     160918554/348689415 objects misplaced (46.150%)
>> > >>>> >> >> >              2497 active+clean
>> > >>>> >> >> >              779  active+remapped+backfill_wait
>> > >>>> >> >> >              538  active+remapped+backfill_wait+backfill_toofull
>> > >>>> >> >> >              186  active+remapped+backfilling
>> > >>>> >> >> >              32   active+remapped+backfill_toofull
>> > >>>> >> >> >
>> > >>>> >> >> >   io:
>> > >>>> >> >> >     client:   27 MiB/s rd, 69 MiB/s wr, 497 op/s rd, 153 op/s wr
>> > >>>> >> >> >     recovery: 1.5 GiB/s, 922 objects/s
>> > >>>> >> >> >
>> > >>>> >> >> > On Tue, Mar 16, 2021 at 9:34 AM Boris Behrens <bb@xxxxxxxxx> wrote:
>> > >>>> >> >> >>
>> > >>>> >> >> >> Hi Dan,
>> > >>>> >> >> >>
>> > >>>> >> >> >> my EC profile looks very "default" to me.
>> > >>>> >> >> >> [root@s3db1 ~]# ceph osd erasure-code-profile ls
>> > >>>> >> >> >> default
>> > >>>> >> >> >> [root@s3db1 ~]# ceph osd erasure-code-profile get default
>> > >>>> >> >> >> k=2
>> > >>>> >> >> >> m=1
>> > >>>> >> >> >> plugin=jerasure
>> > >>>> >> >> >> technique=reed_sol_van
>> > >>>> >> >> >>
>> > >>>> >> >> >> I don't understand the output, but the balancing got worse overnight:
>> > >>>> >> >> >>
>> > >>>> >> >> >> [root@s3db1 ~]# ceph-scripts/tools/ceph-pool-pg-distribution 11
>> > >>>> >> >> >> Searching for PGs in pools: ['11']
>> > >>>> >> >> >> Summary: 1024 PGs on 84 osds
>> > >>>> >> >> >>
>> > >>>> >> >> >> Num OSDs with X PGs:
>> > >>>> >> >> >> 15: 8
>> > >>>> >> >> >> 16: 7
>> > >>>> >> >> >> 17: 6
>> > >>>> >> >> >> 18: 10
>> > >>>> >> >> >> 19: 1
>> > >>>> >> >> >> 32: 10
>> > >>>> >> >> >> 33: 4
>> > >>>> >> >> >> 34: 6
>> > >>>> >> >> >> 35: 8
>> > >>>> >> >> >> 65: 5
>> > >>>> >> >> >> 66: 5
>> > >>>> >> >> >> 67: 4
>> > >>>> >> >> >> 68: 10
>> > >>>> >> >> >> [root@s3db1 ~]# ceph-scripts/tools/ceph-pg-histogram --normalize --pool=11
>> > >>>> >> >> >> # NumSamples = 84; Min = 4.12; Max = 5.09
>> > >>>> >> >> >> # Mean = 4.553355; Variance = 0.052415; SD = 0.228942; Median 4.561608
>> > >>>> >> >> >> # each ∎ represents a count of 1
>> > >>>> >> >> >>     4.1244 -     4.2205 [     8]: ∎∎∎∎∎∎∎∎
>> > >>>> >> >> >>     4.2205 -     4.3166 [     6]: ∎∎∎∎∎∎
>> > >>>> >> >> >>     4.3166 -     4.4127 [    11]: ∎∎∎∎∎∎∎∎∎∎∎
>> > >>>> >> >> >>     4.4127 -     4.5087 [    10]: ∎∎∎∎∎∎∎∎∎∎
>> > >>>> >> >> >>     4.5087 -     4.6048 [    11]: ∎∎∎∎∎∎∎∎∎∎∎
>> > >>>> >> >> >>     4.6048 -     4.7009 [    19]: ∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎
>> > >>>> >> >> >>     4.7009 -     4.7970 [     6]: ∎∎∎∎∎∎
>> > >>>> >> >> >>     4.7970 -     4.8931 [     8]: ∎∎∎∎∎∎∎∎
>> > >>>> >> >> >>     4.8931 -     4.9892 [     4]: ∎∎∎∎
>> > >>>> >> >> >>     4.9892 -     5.0852 [     1]: ∎
>> > >>>> >> >> >> [root@s3db1 ~]# ceph osd df tree | sort -nk 17 | tail
>> > >>>> >> >> >>  14   hdd   3.63689  1.00000 3.6 TiB 2.9 TiB 724 GiB   19 GiB     0 B 724 GiB 80.56 1.07  56     up         osd.14
>> > >>>> >> >> >>  19   hdd   3.68750  1.00000 3.7 TiB 3.0 TiB 2.9 TiB  466 MiB 7.9 GiB 708 GiB 81.25 1.08  53     up         osd.19
>> > >>>> >> >> >>   4   hdd   3.63689  1.00000 3.6 TiB 3.0 TiB 698 GiB  703 MiB     0 B 698 GiB 81.27 1.08  48     up         osd.4
>> > >>>> >> >> >>  24   hdd   3.63689  1.00000 3.6 TiB 3.0 TiB 695 GiB  640 MiB     0 B 695 GiB 81.34 1.08  46     up         osd.24
>> > >>>> >> >> >>  75   hdd   3.68750  1.00000 3.7 TiB 3.0 TiB 2.9 TiB  440 MiB 8.1 GiB 704 GiB 81.35 1.08  48     up         osd.75
>> > >>>> >> >> >>  71   hdd   3.68750  1.00000 3.7 TiB 3.0 TiB 3.0 TiB  7.5 MiB 8.0 GiB 663 GiB 82.44 1.09  47     up         osd.71
>> > >>>> >> >> >>  76   hdd   3.68750  1.00000 3.7 TiB 3.1 TiB 3.0 TiB  251 MiB 9.0 GiB 617 GiB 83.65 1.11  50     up         osd.76
>> > >>>> >> >> >>  33   hdd   3.73630  1.00000 3.7 TiB 3.1 TiB 3.0 TiB  399 MiB 8.1 GiB 618 GiB 83.85 1.11  55     up         osd.33
>> > >>>> >> >> >>  35   hdd   3.73630  1.00000 3.7 TiB 3.1 TiB 3.0 TiB  317 MiB 8.8 GiB 617 GiB 83.87 1.11  50     up         osd.35
>> > >>>> >> >> >>  34   hdd   3.73630  1.00000 3.7 TiB 3.2 TiB 3.1 TiB  451 MiB 8.7 GiB 545 GiB 85.75 1.14  54     up         osd.34
>> > >>>> >> >> >>
>> > >>>> >> >> >> On Mon, Mar 15, 2021 at 5:23 PM Dan van der Ster <dan@xxxxxxxxxxxxxx> wrote:
>> > >>>> >> >> >>>
>> > >>>> >> >> >>> Hi,
>> > >>>> >> >> >>>
>> > >>>> >> >> >>> How wide are your EC profiles? If they are really wide, you might be
>> > >>>> >> >> >>> reaching the limits of what is physically possible. Also, I'm not sure
>> > >>>> >> >> >>> that upmap in 14.2.11 is very smart about *improving* existing upmap
>> > >>>> >> >> >>> rules for a given PG, in the case that a PG already has an upmap-items
>> > >>>> >> >> >>> entry but it would help the distribution to add more mapping pairs to
>> > >>>> >> >> >>> that entry. What this means, is that it might sometimes be useful to
>> > >>>> >> >> >>> randomly remove some upmap entries and see if the balancer does a
>> > >>>> >> >> >>> better job when it replaces them.
>> > >>>> >> >> >>>
>> > >>>> >> >> >>> But before you do that, I re-remembered that looking at the total PG
>> > >>>> >> >> >>> numbers is not useful -- you need to check the PGs per OSD for the
>> > >>>> >> >> >>> eu-central-1.rgw.buckets.data pool only.
>> > >>>> >> >> >>>
>> > >>>> >> >> >>> We have a couple tools that can help with this:
>> > >>>> >> >> >>>
>> > >>>> >> >> >>> 1. To see the PGs per OSD for a given pool:
>> > >>>> >> >> >>>
>> > >>>> >> >> >>> https://github.com/cernceph/ceph-scripts/blob/master/tools/ceph-pool-pg-distribution
>> > >>>> >> >> >>>
>> > >>>> >> >> >>>     E.g.: ./ceph-pool-pg-distribution 11  # to see the distribution of
>> > >>>> >> >> >>> your eu-central-1.rgw.buckets.data pool.
>> > >>>> >> >> >>>
>> > >>>> >> >> >>>     The output looks like this on my well balanced clusters:
>> > >>>> >> >> >>>
>> > >>>> >> >> >>> # ceph-scripts/tools/ceph-pool-pg-distribution 15
>> > >>>> >> >> >>> Searching for PGs in pools: ['15']
>> > >>>> >> >> >>> Summary: 256 pgs on 56 osds
>> > >>>> >> >> >>>
>> > >>>> >> >> >>> Num OSDs with X PGs:
>> > >>>> >> >> >>>  13: 16
>> > >>>> >> >> >>>  14: 40
>> > >>>> >> >> >>>
>> > >>>> >> >> >>>     You should expect a trimodal distribution for your cluster.
>> > >>>> >> >> >>>
>> > >>>> >> >> >>> 2. You can also use another script from that repo to see the PGs per
>> > >>>> >> >> >>> OSD normalized to crush weight:
>> > >>>> >> >> >>>     ceph-scripts/tools/ceph-pg-histogram --normalize --pool=15
>> > >>>> >> >> >>>
>> > >>>> >> >> >>>    This might explain what is going wrong.
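
The expected number of PGs per OSD for this pool can be worked out with a little arithmetic. The sketch below assumes that the pool's 1024 PGs each place k + m = 3 shards (the k=2, m=1 profile quoted elsewhere in this thread) and that they spread over all OSDs under the root in proportion to CRUSH weight, using the root weight 673.54224 from the `ceph osd df tree` output. The function name is made up for this example.

```python
def expected_pgs_per_osd(
    pg_num: int, shards_per_pg: int, osd_weight: float, total_weight: float
) -> float:
    """PG shards an OSD would hold if placement followed CRUSH weight exactly."""
    return pg_num * shards_per_pg * osd_weight / total_weight


PG_NUM = 1024            # pool 11 before the split
SHARDS = 2 + 1           # k + m of the default EC profile
TOTAL_WEIGHT = 673.54224 # root default, Mar 15 listing

# The three disk sizes in the cluster give the three modes.
expected = {
    weight: expected_pgs_per_osd(PG_NUM, SHARDS, weight, TOTAL_WEIGHT)
    for weight in (3.63689, 7.27739, 14.65039)
}
for weight, pgs in expected.items():
    print(f"crush weight {weight:>8}: about {pgs:.1f} PGs")
```

Under these assumptions the three disk sizes give about 17, 33 and 67 PGs per OSD, which lines up with the three clusters in the ceph-pool-pg-distribution output for pool 11, and 3072 / 673.54 is about 4.56 PGs per unit of weight, close to the mean of the normalized histogram.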
>> > >>>> >> >> >>>
>> > >>>> >> >> >>> Cheers, Dan
>> > >>>> >> >> >>>
>> > >>>> >> >> >>>
>> > >>>> >> >> >>> On Mon, Mar 15, 2021 at 3:04 PM Boris Behrens <bb@xxxxxxxxx> wrote:
>> > >>>> >> >> >>> >
>> > >>>> >> >> >>> > Absolutely:
>> > >>>> >> >> >>> > [root@s3db1 ~]# ceph osd df tree
>> > >>>> >> >> >>> > ID  CLASS WEIGHT    REWEIGHT SIZE    RAW USE DATA     OMAP     META    AVAIL    %USE  VAR  PGS STATUS TYPE NAME
>> > >>>> >> >> >>> >  -1       673.54224        - 674 TiB 496 TiB  468 TiB   97 GiB 1.2 TiB  177 TiB 73.67 1.00   -        root default
>> > >>>> >> >> >>> >  -2        58.30331        -  58 TiB  42 TiB   38 TiB  9.2 GiB  99 GiB   16 TiB 72.88 0.99   -            host s3db1
>> > >>>> >> >> >>> >  23   hdd  14.65039  1.00000  15 TiB  11 TiB   11 TiB  714 MiB  25 GiB  3.7 TiB 74.87 1.02 194     up         osd.23
>> > >>>> >> >> >>> >  69   hdd  14.55269  1.00000  15 TiB  11 TiB   11 TiB  1.6 GiB  40 GiB  3.4 TiB 76.32 1.04 199     up         osd.69
>> > >>>> >> >> >>> >  73   hdd  14.55269  1.00000  15 TiB  11 TiB   11 TiB  1.3 GiB  34 GiB  3.8 TiB 74.15 1.01 203     up         osd.73
>> > >>>> >> >> >>> >  79   hdd   3.63689  1.00000 3.6 TiB 2.4 TiB  1.3 TiB  1.8 GiB     0 B  1.3 TiB 65.44 0.89  47     up         osd.79
>> > >>>> >> >> >>> >  80   hdd   3.63689  1.00000 3.6 TiB 2.4 TiB  1.3 TiB  2.2 GiB     0 B  1.3 TiB 65.34 0.89  48     up         osd.80
>> > >>>> >> >> >>> >  81   hdd   3.63689  1.00000 3.6 TiB 2.4 TiB  1.3 TiB  1.1 GiB     0 B  1.3 TiB 65.38 0.89  47     up         osd.81
>> > >>>> >> >> >>> >  82   hdd   3.63689  1.00000 3.6 TiB 2.5 TiB  1.1 TiB  619 MiB     0 B  1.1 TiB 68.46 0.93  41     up         osd.82
>> > >>>> >> >> >>> > -11        50.94173        -  51 TiB  37 TiB   37 TiB  3.5 GiB  98 GiB   14 TiB 71.90 0.98   -            host s3db10
>> > >>>> >> >> >>> >  63   hdd   7.27739  1.00000 7.3 TiB 5.3 TiB  5.3 TiB  647 MiB  14 GiB  2.0 TiB 72.72 0.99  94     up         osd.63
>> > >>>> >> >> >>> >  64   hdd   7.27739  1.00000 7.3 TiB 5.3 TiB  5.2 TiB  668 MiB  14 GiB  2.0 TiB 72.23 0.98  93     up         osd.64
>> > >>>> >> >> >>> >  65   hdd   7.27739  1.00000 7.3 TiB 5.2 TiB  5.2 TiB  227 MiB  14 GiB  2.1 TiB 71.16 0.97 100     up         osd.65
>> > >>>> >> >> >>> >  66   hdd   7.27739  1.00000 7.3 TiB 5.4 TiB  5.4 TiB  313 MiB  14 GiB  1.9 TiB 74.25 1.01  92     up         osd.66
>> > >>>> >> >> >>> >  67   hdd   7.27739  1.00000 7.3 TiB 5.1 TiB  5.1 TiB  584 MiB  14 GiB  2.1 TiB 70.63 0.96  96     up         osd.67
>> > >>>> >> >> >>> >  68   hdd   7.27739  1.00000 7.3 TiB 5.2 TiB  5.2 TiB  720 MiB  14 GiB  2.1 TiB 71.72 0.97 101     up         osd.68
>> > >>>> >> >> >>> >  70   hdd   7.27739  1.00000 7.3 TiB 5.1 TiB  5.1 TiB  425 MiB  14 GiB  2.1 TiB 70.59 0.96  97     up         osd.70
>> > >>>> >> >> >>> > -12        50.99052        -  51 TiB  38 TiB   37 TiB  2.1 GiB  97 GiB   13 TiB 73.77 1.00   -            host s3db11
>> > >>>> >> >> >>> >  46   hdd   7.27739  1.00000 7.3 TiB 5.6 TiB  5.6 TiB  229 MiB  14 GiB  1.7 TiB 77.05 1.05  97     up         osd.46
>> > >>>> >> >> >>> >  47   hdd   7.27739  1.00000 7.3 TiB 5.1 TiB  5.1 TiB  159 MiB  13 GiB  2.2 TiB 70.00 0.95  89     up         osd.47
>> > >>>> >> >> >>> >  48   hdd   7.27739  1.00000 7.3 TiB 5.2 TiB  5.2 TiB  279 MiB  14 GiB  2.1 TiB 71.82 0.97  98     up         osd.48
>> > >>>> >> >> >>> >  49   hdd   7.27739  1.00000 7.3 TiB 5.5 TiB  5.4 TiB  276 MiB  14 GiB  1.8 TiB 74.90 1.02  95     up         osd.49
>> > >>>> >> >> >>> >  50   hdd   7.27739  1.00000 7.3 TiB 5.2 TiB  5.2 TiB  336 MiB  14 GiB  2.0 TiB 72.13 0.98  93     up         osd.50
>> > >>>> >> >> >>> >  51   hdd   7.27739  1.00000 7.3 TiB 5.7 TiB  5.6 TiB  728 MiB  15 GiB  1.6 TiB 77.76 1.06  98     up         osd.51
>> > >>>> >> >> >>> >  72   hdd   7.32619  1.00000 7.3 TiB 5.3 TiB  5.3 TiB  147 MiB  13 GiB  2.0 TiB 72.75 0.99  95     up         osd.72
>> > >>>> >> >> >>> > -37        58.55478        -  59 TiB  44 TiB   44 TiB  4.4 GiB 122 GiB   15 TiB 75.20 1.02   -            host s3db12
>> > >>>> >> >> >>> >  19   hdd   3.68750  1.00000 3.7 TiB 2.9 TiB  2.9 TiB  454 MiB 8.2 GiB  780 GiB 79.35 1.08  53     up         osd.19
>> > >>>> >> >> >>> >  71   hdd   3.68750  1.00000 3.7 TiB 3.0 TiB  2.9 TiB  7.1 MiB 8.0 GiB  734 GiB 80.56 1.09  47     up         osd.71
>> > >>>> >> >> >>> >  75   hdd   3.68750  1.00000 3.7 TiB 2.9 TiB  2.9 TiB  439 MiB 8.2 GiB  777 GiB 79.43 1.08  48     up         osd.75
>> > >>>> >> >> >>> >  76   hdd   3.68750  1.00000 3.7 TiB 3.0 TiB  3.0 TiB  241 MiB 8.9 GiB  688 GiB 81.77 1.11  50     up         osd.76
>> > >>>> >> >> >>> >  77   hdd  14.60159  1.00000  15 TiB  11 TiB   11 TiB  880 MiB  30 GiB  3.6 TiB 75.44 1.02 201     up         osd.77
>> > >>>> >> >> >>> >  78   hdd  14.60159  1.00000  15 TiB  10 TiB   10 TiB 1015 MiB  28 GiB  4.2 TiB 71.26 0.97 193     up         osd.78
>> > >>>> >> >> >>> >  83   hdd  14.60159  1.00000  15 TiB  11 TiB   11 TiB  1.4 GiB  30 GiB  3.8 TiB 73.76 1.00 203     up         osd.83
>> > >>>> >> >> >>> >  -3        58.49872        -  58 TiB  42 TiB   36 TiB  8.2 GiB  89 GiB   17 TiB 71.71 0.97   -            host s3db2
>> > >>>> >> >> >>> >   1   hdd  14.65039  1.00000  15 TiB  11 TiB   11 TiB  3.2 GiB  37 GiB  3.7 TiB 74.58 1.01 196     up         osd.1
>> > >>>> >> >> >>> >   3   hdd   3.63689  1.00000 3.6 TiB 2.3 TiB  1.3 TiB  566 MiB     0 B  1.3 TiB 64.11 0.87  50     up         osd.3
>> > >>>> >> >> >>> >   4   hdd   3.63689  1.00000 3.6 TiB 2.9 TiB  771 GiB  695 MiB     0 B  771 GiB 79.30 1.08  48     up         osd.4
>> > >>>> >> >> >>> >   5   hdd   3.63689  1.00000 3.6 TiB 2.4 TiB  1.2 TiB  482 MiB     0 B  1.2 TiB 66.51 0.90  49     up         osd.5
>> > >>>> >> >> >>> >   6   hdd   3.63689  1.00000 3.6 TiB 2.3 TiB  1.3 TiB  1.8 GiB     0 B  1.3 TiB 64.00 0.87  42     up         osd.6
>> > >>>> >> >> >>> >   7   hdd  14.65039  1.00000  15 TiB  11 TiB   11 TiB  639 MiB  26 GiB  4.0 TiB 72.44 0.98 192     up         osd.7
>> > >>>> >> >> >>> >  74   hdd  14.65039  1.00000  15 TiB  10 TiB   10 TiB  907 MiB  26 GiB  4.2 TiB 71.32 0.97 193     up         osd.74
>> > >>>> >> >> >>> >  -4        58.49872        -  58 TiB  43 TiB   36 TiB   34 GiB  85 GiB   16 TiB 72.69 0.99   -            host s3db3
>> > >>>> >> >> >>> >   2   hdd  14.65039  1.00000  15 TiB  11 TiB   11 TiB  980 MiB  26 GiB  3.8 TiB 74.36 1.01 203     up         osd.2
>> > >>>> >> >> >>> >   9   hdd  14.65039  1.00000  15 TiB  11 TiB   11 TiB  8.4 GiB  33 GiB  3.9 TiB 73.51 1.00 186     up         osd.9
>> > >>>> >> >> >>> >  10   hdd  14.65039  1.00000  15 TiB  10 TiB   10 TiB  650 MiB  26 GiB  4.2 TiB 71.64 0.97 201     up         osd.10
>> > >>>> >> >> >>> >  12   hdd   3.63689  1.00000 3.6 TiB 2.3 TiB  1.3 TiB  754 MiB     0 B  1.3 TiB 64.17 0.87  44     up         osd.12
>> > >>>> >> >> >>> >  13   hdd   3.63689  1.00000 3.6 TiB 2.8 TiB  813 GiB  2.4 GiB     0 B  813 GiB 78.17 1.06  58     up         osd.13
>> > >>>> >> >> >>> >  14   hdd   3.63689  1.00000 3.6 TiB 2.9 TiB  797 GiB   19 GiB     0 B  797 GiB 78.60 1.07  56     up         osd.14
>> > >>>> >> >> >>> >  15   hdd   3.63689  1.00000 3.6 TiB 2.3 TiB  1.3 TiB  2.2 GiB     0 B  1.3 TiB 63.96 0.87  41     up         osd.15
>> > >>>> >> >> >>> >  -5        58.49872        -  58 TiB  43 TiB   36 TiB  6.7 GiB  97 GiB   15 TiB 74.04 1.01   -            host s3db4
>> > >>>> >> >> >>> >  11   hdd  14.65039  1.00000  15 TiB  11 TiB   11 TiB  940 MiB  26 GiB  4.0 TiB 72.49 0.98 196     up         osd.11
>> > >>>> >> >> >>> >  17   hdd  14.65039  1.00000  15 TiB  11 TiB   11 TiB 1022 MiB  26 GiB  3.6 TiB 75.23 1.02 204     up         osd.17
>> > >>>> >> >> >>> >  18   hdd  14.65039  1.00000  15 TiB  11 TiB   11 TiB  945 MiB  45 GiB  3.8 TiB 74.16 1.01 193     up         osd.18
>> > >>>> >> >> >>> >  20   hdd   3.63689  1.00000 3.6 TiB 2.6 TiB 1020 GiB  596 MiB     0 B 1020 GiB 72.62 0.99  57     up         osd.20
>> > >>>> >> >> >>> >  21   hdd   3.63689  1.00000 3.6 TiB 2.6 TiB 1023 GiB  1.9 GiB     0 B 1023 GiB 72.54 0.98  41     up         osd.21
>> > >>>> >> >> >>> >  22   hdd   3.63689  1.00000 3.6 TiB 2.6 TiB 1023 GiB  797 MiB     0 B 1023 GiB 72.54 0.98  53     up         osd.22
>> > >>>> >> >> >>> >  24   hdd   3.63689  1.00000 3.6 TiB 2.9 TiB  766 GiB  618 MiB     0 B  766 GiB 79.42 1.08  46     up         osd.24
>> > >>>> >> >> >>> >  -6        58.89636        -  59 TiB  43 TiB   43 TiB  3.0 GiB 108 GiB   16 TiB 73.40 1.00   -            host s3db5
>> > >>>> >> >> >>> >   0   hdd   3.73630  1.00000 3.7 TiB 2.7 TiB  2.6 TiB   92 MiB 7.2 GiB  1.1 TiB 71.16 0.97  45     up         osd.0
>> > >>>> >> >> >>> >  25   hdd   3.73630  1.00000 3.7 TiB 2.7 TiB  2.6 TiB  2.4 MiB 7.3 GiB  1.1 TiB 71.23 0.97  41     up         osd.25
>> > >>>> >> >> >>> >  26   hdd   3.73630  1.00000 3.7 TiB 2.8 TiB  2.7 TiB  181 MiB 7.6 GiB  935 GiB 75.57 1.03  45     up         osd.26
>> > >>>> >> >> >>> >  27   hdd   3.73630  1.00000 3.7 TiB 2.7 TiB  2.6 TiB  5.1 MiB 7.0 GiB  1.1 TiB 71.20 0.97  47     up         osd.27
>> > >>>> >> >> >>> >  28   hdd  14.65039  1.00000  15 TiB  11 TiB   11 TiB  977 MiB  26 GiB  3.8 TiB 73.85 1.00 197     up         osd.28
>> > >>>> >> >> >>> >  29   hdd  14.65039  1.00000  15 TiB  11 TiB   10 TiB  872 MiB  26 GiB  4.1 TiB 71.98 0.98 196     up         osd.29
>> > >>>> >> >> >>> >  30   hdd  14.65039  1.00000  15 TiB  11 TiB   11 TiB  943 MiB  27 GiB  3.6 TiB 75.51 1.03 202     up         osd.30
>> > >>>> >> >> >>> >  -7        58.89636        -  59 TiB  44 TiB   43 TiB   13 GiB 122 GiB   15 TiB 74.97 1.02   -            host s3db6
>> > >>>> >> >> >>> >  32   hdd   3.73630  1.00000 3.7 TiB 2.8 TiB  2.7 TiB   27 MiB 7.6 GiB  940 GiB 75.42 1.02  55     up         osd.32
>> > >>>> >> >> >>> >  33   hdd   3.73630  1.00000 3.7 TiB 3.1 TiB  3.0 TiB  376 MiB 8.2 GiB  691 GiB 81.94 1.11  55     up         osd.33
>> > >>>> >> >> >>> >  34   hdd   3.73630  1.00000 3.7 TiB 3.1 TiB  3.0 TiB  450 MiB 8.5 GiB  620 GiB 83.79 1.14  54     up         osd.34
>> > >>>> >> >> >>> >  35   hdd   3.73630  1.00000 3.7 TiB 3.1 TiB  3.0 TiB  316 MiB 8.4 GiB  690 GiB 81.98 1.11  50     up         osd.35
>> > >>>> >> >> >>> >  36   hdd  14.65039  1.00000  15 TiB  11 TiB   10 TiB  489 MiB  25 GiB  4.1 TiB 71.69 0.97 208     up         osd.36
>> > >>>> >> >> >>> >  37   hdd  14.65039  1.00000  15 TiB  11 TiB   11 TiB   11 GiB  38 GiB  4.0 TiB 72.41 0.98 195     up         osd.37
>> > >>>> >> >> >>> >  38   hdd  14.65039  1.00000  15 TiB  11 TiB   11 TiB  1.1 GiB  26 GiB  3.7 TiB 74.88 1.02 204     up         osd.38
>> > >>>> >> >> >>> >  -8        58.89636        -  59 TiB  44 TiB   43 TiB  3.8 GiB 111 GiB   15 TiB 74.16 1.01   -            host s3db7
>> > >>>> >> >> >>> >  39   hdd   3.73630  1.00000 3.7 TiB 2.8 TiB  2.7 TiB   19 MiB 7.5 GiB  936 GiB 75.54 1.03  39     up         osd.39
>> > >>>> >> >> >>> >  40   hdd   3.73630  1.00000 3.7 TiB 2.6 TiB  2.5 TiB  144 MiB 7.1 GiB  1.1 TiB 69.87 0.95  39     up         osd.40
>> > >>>> >> >> >>> >  41   hdd   3.73630  1.00000 3.7 TiB 2.7 TiB  2.7 TiB  219 MiB 7.6 GiB 1011 GiB 73.57 1.00  55     up         osd.41
>> > >>>> >> >> >>> >  42   hdd   3.73630  1.00000 3.7 TiB 2.6 TiB  2.5 TiB  593 MiB 7.1 GiB  1.1 TiB 70.02 0.95  47     up         osd.42
>> > >>>> >> >> >>> >  43   hdd  14.65039  1.00000  15 TiB  11 TiB   11 TiB  500 MiB  27 GiB  3.7 TiB 74.67 1.01 204     up         osd.43
>> > >>>> >> >> >>> >  44   hdd  14.65039  1.00000  15 TiB  11 TiB   11 TiB  1.1 GiB  27 GiB  3.7 TiB 74.62 1.01 193     up         osd.44
>> > >>>> >> >> >>> >  45   hdd  14.65039  1.00000  15 TiB  11 TiB   11 TiB  1.2 GiB  29 GiB  3.6 TiB 75.16 1.02 204     up         osd.45
>> > >>>> >> >> >>> >  -9        51.28331        -  51 TiB  39 TiB   39 TiB  4.9 GiB 107 GiB   12 TiB 76.50 1.04   -            host s3db8
>> > >>>> >> >> >>> >   8   hdd   7.32619  1.00000 7.3 TiB 5.6 TiB  5.5 TiB  474 MiB  14 GiB  1.7 TiB 76.37 1.04  98     up         osd.8
>> > >>>> >> >> >>> >  16   hdd   7.32619  1.00000 7.3 TiB 5.7 TiB  5.7 TiB  783 MiB  15 GiB  1.6 TiB 78.39 1.06 100     up         osd.16
>> > >>>> >> >> >>> >  31   hdd   7.32619  1.00000 7.3 TiB 5.7 TiB  5.6 TiB  441 MiB  14 GiB  1.6 TiB 77.70 1.05  91     up         osd.31
>> > >>>> >> >> >>> >  52   hdd   7.32619  1.00000 7.3 TiB 5.6 TiB  5.5 TiB  939 MiB  14 GiB  1.7 TiB 76.29 1.04 102     up         osd.52
>> > >>>> >> >> >>> >  53   hdd   7.32619  1.00000 7.3 TiB 5.4 TiB  5.4 TiB  848 MiB  18 GiB  1.9 TiB 74.30 1.01  98     up         osd.53
>> > >>>> >> >> >>> >  54   hdd   7.32619  1.00000 7.3 TiB 5.6 TiB  5.6 TiB  1.0 GiB  16 GiB  1.7 TiB 76.99 1.05 106     up         osd.54
>> > >>>> >> >> >>> >  55   hdd   7.32619  1.00000 7.3 TiB 5.5 TiB  5.5 TiB  460 MiB  15 GiB  1.8 TiB 75.46 1.02 105     up         osd.55
>> > >>>> >> >> >>> > -10        51.28331        -  51 TiB  37 TiB   37 TiB  3.8 GiB  96 GiB   14 TiB 72.77 0.99   -            host s3db9
>> > >>>> >> >> >>> >  56   hdd   7.32619  1.00000 7.3 TiB 5.2 TiB  5.2 TiB  846 MiB  13 GiB  2.1 TiB 71.16 0.97 104     up         osd.56
>> > >>>> >> >> >>> >  57   hdd   7.32619  1.00000 7.3 TiB 5.6 TiB  5.6 TiB  513 MiB  15 GiB  1.7 TiB 76.53 1.04  96     up         osd.57
>> > >>>> >> >> >>> >  58   hdd   7.32619  1.00000 7.3 TiB 5.2 TiB  5.2 TiB  604 MiB  13 GiB  2.1 TiB 71.23 0.97  98     up         osd.58
>> > >>>> >> >> >>> >  59   hdd   7.32619  1.00000 7.3 TiB 5.1 TiB  5.1 TiB  414 MiB  13 GiB  2.2 TiB 70.03 0.95  88     up         osd.59
>> > >>>> >> >> >>> >  60   hdd   7.32619  1.00000 7.3 TiB 5.5 TiB  5.5 TiB  227 MiB  14 GiB  1.8 TiB 75.54 1.03  97     up         osd.60
>> > >>>> >> >> >>> >  61   hdd   7.32619  1.00000 7.3 TiB 5.1 TiB  5.1 TiB  456
>> > >>>> MiB  13 GiB  2.2 TiB 70.01 0.95  95     up         osd.61
>> > >>>> >> >> >>> >  62   hdd   7.32619  1.00000 7.3 TiB 5.5 TiB  5.4 TiB  843
>> > >>>> MiB  14 GiB  1.8 TiB 74.93 1.02 110     up         osd.62
>> > >>>> >> >> >>> >                        TOTAL 674 TiB 496 TiB  468 TiB   97
>> > >>>> GiB 1.2 TiB  177 TiB 73.67
>> > >>>> >> >> >>> > MIN/MAX VAR: 0.87/1.14  STDDEV: 4.22
>> > >>>> >> >> >>> >
>> > >>>> >> >> >>> > On Mon, Mar 15, 2021 at 3:02 PM Dan van der Ster
>> > >>>> <dan@xxxxxxxxxxxxxx> wrote:
>> > >>>> >> >> >>> >>
>> > >>>> >> >> >>> >> OK thanks. Indeed "prepared 0/10 changes" means it thinks
>> > >>>> things are balanced.
>> > >>>> >> >> >>> >> Could you again share the full ceph osd df tree?
>> > >>>> >> >> >>> >>
>> > >>>> >> >> >>> >> On Mon, Mar 15, 2021 at 2:54 PM Boris Behrens <
>> > >>>> bb@xxxxxxxxx> wrote:
>> > >>>> >> >> >>> >> >
>> > >>>> >> >> >>> >> > Hi Dan,
>> > >>>> >> >> >>> >> >
>> > >>>> >> >> >>> >> > I've set the autoscaler to warn mode, but it does not
>> > >>>> actually warn at the moment, so I am not touching it for now.
>> > >>>> >> >> >>> >> >
>> > >>>> >> >> >>> >> > This is what the log says at one-minute intervals:
>> > >>>> >> >> >>> >> > 2021-03-15 13:51:00.970 7f307d5fd700  4 mgr get_config
>> > >>>> get_config key: mgr/balancer/active
>> > >>>> >> >> >>> >> > 2021-03-15 13:51:00.970 7f307d5fd700  4 mgr get_config
>> > >>>> get_config key: mgr/balancer/sleep_interval
>> > >>>> >> >> >>> >> > 2021-03-15 13:51:00.970 7f307d5fd700  4 mgr get_config
>> > >>>> get_config key: mgr/balancer/begin_time
>> > >>>> >> >> >>> >> > 2021-03-15 13:51:00.970 7f307d5fd700  4 mgr get_config
>> > >>>> get_config key: mgr/balancer/end_time
>> > >>>> >> >> >>> >> > 2021-03-15 13:51:00.970 7f307d5fd700  4 mgr get_config
>> > >>>> get_config key: mgr/balancer/begin_weekday
>> > >>>> >> >> >>> >> > 2021-03-15 13:51:00.970 7f307d5fd700  4 mgr get_config
>> > >>>> get_config key: mgr/balancer/end_weekday
>> > >>>> >> >> >>> >> > 2021-03-15 13:51:00.971 7f307d5fd700  4 mgr get_config
>> > >>>> get_config key: mgr/balancer/pool_ids
>> > >>>> >> >> >>> >> > 2021-03-15 13:51:01.203 7f307d5fd700  4 mgr[balancer]
>> > >>>> Optimize plan auto_2021-03-15_13:51:00
>> > >>>> >> >> >>> >> > 2021-03-15 13:51:01.203 7f307d5fd700  4 mgr get_config
>> > >>>> get_config key: mgr/balancer/mode
>> > >>>> >> >> >>> >> > 2021-03-15 13:51:01.203 7f307d5fd700  4 mgr[balancer]
>> > >>>> Mode upmap, max misplaced 0.050000
>> > >>>> >> >> >>> >> > 2021-03-15 13:51:01.203 7f307d5fd700  4 mgr[balancer]
>> > >>>> do_upmap
>> > >>>> >> >> >>> >> > 2021-03-15 13:51:01.203 7f307d5fd700  4 mgr get_config
>> > >>>> get_config key: mgr/balancer/upmap_max_iterations
>> > >>>> >> >> >>> >> > 2021-03-15 13:51:01.203 7f307d5fd700  4 mgr get_config
>> > >>>> get_config key: mgr/balancer/upmap_max_deviation
>> > >>>> >> >> >>> >> > 2021-03-15 13:51:01.203 7f307d5fd700  4 mgr[balancer]
>> > >>>> pools ['eu-msg-1.rgw.data.root', 'eu-msg-1.rgw.buckets.non-ec',
>> > >>>> 'eu-central-1.rgw.users.keys', 'eu-central-1.rgw.gc',
>> > >>>> 'eu-central-1.rgw.buckets.data', 'eu-central-1.rgw.users.email',
>> > >>>> 'eu-msg-1.rgw.gc', 'eu-central-1.rgw.usage', 'eu-msg-1.rgw.users.keys',
>> > >>>> 'eu-central-1.rgw.buckets.index', 'rbd', 'eu-msg-1.rgw.log',
>> > >>>> 'whitespace-again-2021-03-10_2', 'eu-msg-1.rgw.buckets.index',
>> > >>>> 'eu-msg-1.rgw.meta', 'eu-central-1.rgw.log', 'default.rgw.gc',
>> > >>>> 'eu-central-1.rgw.buckets.non-ec', 'eu-msg-1.rgw.usage',
>> > >>>> 'whitespace-again-2021-03-10', 'fra-1.rgw.meta',
>> > >>>> 'eu-central-1.rgw.users.uid', 'eu-msg-1.rgw.users.email',
>> > >>>> 'fra-1.rgw.control', 'eu-msg-1.rgw.users.uid', 'eu-msg-1.rgw.control',
>> > >>>> '.rgw.root', 'eu-msg-1.rgw.buckets.data', 'default.rgw.control',
>> > >>>> 'fra-1.rgw.log', 'default.rgw.data.root', 'whitespace-again-2021-03-10_3',
>> > >>>> 'default.rgw.log', 'eu-central-1.rgw.meta', 'eu-central-1.rgw.data.root',
>> > >>>> 'default.rgw.users.uid', 'eu-central-1.rgw.control']
>> > >>>> >> >> >>> >> > 2021-03-15 13:51:01.224 7f307d5fd700  4 mgr[balancer]
>> > >>>> prepared 0/10 changes
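
[Editorial sketch, not part of the original thread.] The "prepared 0/10 changes" summary quoted above is the line to watch when checking whether the balancer found anything to move. A minimal Python sketch for pulling that pair of numbers out of a mgr log line might look like the following. The helper name `parse_prepared` and the regular expression are assumptions made for illustration; only the sample log line comes from the thread.

```python
import re

# Hypothetical helper, not part of Ceph: matches the "prepared N/M changes"
# summary that the balancer module logs at debug_mgr 4/5 (format taken from
# the log excerpt in this thread).
PREPARED_RE = re.compile(r"mgr\[balancer\]\s+prepared (\d+)/(\d+) changes")


def parse_prepared(line):
    """Return (prepared, limit) as integers, or None if the line does not match."""
    match = PREPARED_RE.search(line)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


sample = "2021-03-15 13:51:01.224 7f307d5fd700  4 mgr[balancer] prepared 0/10 changes"
prepared, limit = parse_prepared(sample)
print(prepared, limit)
```

A result with `prepared == 0` on every run corresponds to the situation described in the thread, where the balancer considers the cluster balanced under its current settings.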
>> > >>>> >> >> >>> >> >
>> > >>>> >> >> >>> >> > On Mon, Mar 15, 2021 at 2:15 PM Dan van der
>> > >>>> Ster <dan@xxxxxxxxxxxxxx> wrote:
>> > >>>> >> >> >>> >> >>
>> > >>>> >> >> >>> >> >> I suggest just disabling the autoscaler until your
>> > >>>> balancing is understood.
>> > >>>> >> >> >>> >> >>
>> > >>>> >> >> >>> >> >> What does your active mgr log say (with debug_mgr 4/5)?
>> > >>>> >> >> >>> >> >> For example:
>> > >>>> >> >> >>> >> >>   grep balancer /var/log/ceph/ceph-mgr.*.log
>> > >>>> >> >> >>> >> >>
>> > >>>> >> >> >>> >> >> -- Dan
>> > >>>> >> >> >>> >> >>
>> > >>>> >> >> >>> >> >> On Mon, Mar 15, 2021 at 1:47 PM Boris Behrens <
>> > >>>> bb@xxxxxxxxx> wrote:
>> > >>>> >> >> >>> >> >> >
>> > >>>> >> >> >>> >> >> > Hi,
>> > >>>> >> >> >>> >> >> > Unfortunately this did not solve my problem. I still
>> > >>>> have some OSDs that fill up to 85%.
>> > >>>> >> >> >>> >> >> >
>> > >>>> >> >> >>> >> >> > According to the logging, the autoscaler might want
>> > >>>> to add more PGs to one pool and reduce almost all other pools to 32.
>> > >>>> >> >> >>> >> >> > 2021-03-15 12:19:58.825 7f307f601700  4
>> > >>>> mgr[pg_autoscaler] Pool 'eu-central-1.rgw.buckets.data' root_id -1 using
>> > >>>> 0.705080476146 of space, bias 1.0, pg target 1974.22533321 quantized to
>> > >>>> 2048 (current 1024)
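
[Editorial sketch, not part of the original thread.] The autoscaler line above reports a PG target of 1974.22533321 "quantized to 2048". One way to read that, assuming "quantized" means rounding to the nearest power of two, is sketched below in Python. The function is an illustration written for this note and is not taken from the pg_autoscaler source, which may handle ties and minimums differently.

```python
import math


def nearest_power_of_two(target):
    """Round a positive PG target to the closest power of two.

    Assumption for illustration: ties are rounded up.
    """
    lower = 2 ** math.floor(math.log2(target))
    upper = lower * 2
    return lower if (target - lower) < (upper - target) else upper


# Value taken from the log line in the thread.
print(nearest_power_of_two(1974.22533321))
```

Under that assumption the target lands on 2048, which is twice the pool's current 1024 PGs and matches the number in the log line.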
>> > >>>> >> >> >>> >> >> >
>> > >>>> >> >> >>> >> >> > Why the balancing does not happen is still nebulous
>> > >>>> to me.
>> > >>>> >> >> >>> >> >> >
>> > >>>> >> >> >>> >> >> >
>> > >>>> >> >> >>> >> >> >
>> > >>>> >> >> >>> >> >> > On Sat, Mar 13, 2021 at 4:37 PM Dan van
>> > >>>> der Ster <dan@xxxxxxxxxxxxxx> wrote:
>> > >>>> >> >> >>> >> >> >>
>> > >>>> >> >> >>> >> >> >> OK
>> > >>>> >> >> >>> >> >> >> Btw, you might need to fail over to a new mgr... I'm not
>> > >>>> sure if the current active will read that new config.
>> > >>>> >> >> >>> >> >> >>
>> > >>>> >> >> >>> >> >> >> .. dan
>> > >>>> >> >> >>> >> >> >>
>> > >>>> >> >> >>> >> >> >>
>> > >>>> >> >> >>> >> >> >> On Sat, Mar 13, 2021, 4:36 PM Boris Behrens <
>> > >>>> bb@xxxxxxxxx> wrote:
>> > >>>> >> >> >>> >> >> >>>
>> > >>>> >> >> >>> >> >> >>> Hi,
>> > >>>> >> >> >>> >> >> >>>
>> > >>>> >> >> >>> >> >> >>> OK, thanks. I just changed the value and reweighted
>> > >>>> everything back to 1. I will let it sync over the weekend and check how
>> > >>>> it looks on Monday.
>> > >>>> >> >> >>> >> >> >>> We tried to keep each system's total storage as balanced
>> > >>>> as possible. New systems will have 8TB disks, but for the existing ones we
>> > >>>> added 16TB disks to offset the 4TB disks, and we needed a lot of storage
>> > >>>> fast because of a DC move. If you have any recommendations I would be
>> > >>>> happy to hear them.
>> > >>>> >> >> >>> >> >> >>>
>> > >>>> >> >> >>> >> >> >>> Cheers
>> > >>>> >> >> >>> >> >> >>>  Boris
>> > >>>> >> >> >>> >> >> >>>
>> > >>>> >> >> >>> >> >> >>> On Sat, Mar 13, 2021 at 4:20 PM Dan van
>> > >>>> der Ster <dan@xxxxxxxxxxxxxx> wrote:
>> > >>>> >> >> >>> >> >> >>>>
>> > >>>> >> >> >>> >> >> >>>> Thanks.
>> > >>>> >> >> >>> >> >> >>>>
>> > >>>> >> >> >>> >> >> >>>> Decreasing the max deviation to 2 or 1 should help
>> > >>>> in your case. This option controls when the balancer stops trying to move
>> > >>>> PGs around -- by default it stops when the deviation from the mean is 5.
>> > >>>> Yes this is too large IMO -- all of our clusters have this set to 1.
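
[Editorial sketch, not part of the original thread.] To illustrate why lowering the max deviation from 5 to 1 can make the balancer start moving PGs again, here is a toy Python model. It assumes the deviation is counted in PGs against a target proportional to each OSD's weight. The function names and the three-OSD example are hypothetical, and the real balancer works per pool and within CRUSH constraints, so this is only a rough picture of the stopping rule described above.

```python
def pg_deviations(pg_counts, weights):
    """Deviation of each OSD's PG count from its weight-proportional target."""
    total_pgs = sum(pg_counts)
    total_weight = sum(weights)
    return [pgs - total_pgs * w / total_weight
            for pgs, w in zip(pg_counts, weights)]


def balancer_would_act(pg_counts, weights, max_deviation):
    """True if any OSD is further from its target than max_deviation PGs."""
    return any(abs(d) > max_deviation
               for d in pg_deviations(pg_counts, weights))


# Hypothetical cluster: two small OSDs and one OSD of twice the weight.
weights = [1.0, 1.0, 2.0]
pg_counts = [14, 6, 20]  # targets would be 10, 10, 20

print(balancer_would_act(pg_counts, weights, 5))  # False: 4 PGs off is tolerated
print(balancer_would_act(pg_counts, weights, 1))  # True: 4 PGs off is too much
```

On small OSDs that hold only around 50 PGs, as in this cluster, a tolerated deviation of 5 PGs is roughly 10% of the data on that disk, which is in line with the uneven %USE values reported in the thread.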
>> > >>>> >> >> >>> >> >> >>>>
>> > >>>> >> >> >>> >> >> >>>> And given that you have some OSDs with more than
>> > >>>> 200 PGs, you definitely shouldn't increase the num PGs.
>> > >>>> >> >> >>> >> >> >>>>
>> > >>>> >> >> >>> >> >> >>>> But anyway with your mixed device sizes it might
>> > >>>> be challenging to make a perfectly uniform distribution. Give it a try with
>> > >>>> 1 though, and let us know how it goes.
>> > >>>> >> >> >>> >> >> >>>>
>> > >>>> >> >> >>> >> >> >>>> .. Dan
>> > >>>> >> >> >>> >> >> >>>>
>> > >>>> >> >> >>> >> >> >>>>
>> > >>>> >> >> >>> >> >> >>>>
>> > >>>> >> >> >>> >> >> >>>>
>> > >>>> >> >> >>> >> >> >>>>
>> > >>>> >> >> >>> >> >> >>>> On Sat, Mar 13, 2021, 4:11 PM Boris Behrens <
>> > >>>> bb@xxxxxxxxx> wrote:
>> > >>>> >> >> >>> >> >> >>>>>
>> > >>>> >> >> >>> >> >> >>>>> Hi Dan,
>> > >>>> >> >> >>> >> >> >>>>>
>> > >>>> >> >> >>> >> >> >>>>> upmap_max_deviation is at the default (5) in our
>> > >>>> cluster. Is 1 the recommended deviation?
>> > >>>> >> >> >>> >> >> >>>>>
>> > >>>> >> >> >>> >> >> >>>>> I added the whole ceph osd df tree. (I need to
>> > >>>> remove some OSDs and re-add them as bluestore with SSD, so 69, 73 and 82
>> > >>>> are a bit off now. I also reweighted some OSDs to try to even out the %USE.)
>> > >>>> >> >> >>> >> >> >>>>>
>> > >>>> >> >> >>> >> >> >>>>> I will increase the mgr debugging to see what the
>> > >>>> problem is.
>> > >>>> >> >> >>> >> >> >>>>>
>> > >>>> >> >> >>> >> >> >>>>> [root@s3db1 ~]# ceph osd df tree
>> > >>>> >> >> >>> >> >> >>>>> ID  CLASS WEIGHT    REWEIGHT SIZE    RAW USE
>> > >>>> DATA    OMAP    META    AVAIL   %USE  VAR  PGS STATUS TYPE NAME
>> > >>>> >> >> >>> >> >> >>>>>  -1       673.54224        - 659 TiB 491 TiB 464
>> > >>>> TiB  96 GiB 1.2 TiB 168 TiB 74.57 1.00   -        root default
>> > >>>> >> >> >>> >> >> >>>>>  -2        58.30331        -  44 TiB  22 TiB  17
>> > >>>> TiB 5.7 GiB  38 GiB  22 TiB 49.82 0.67   -            host s3db1
>> > >>>> >> >> >>> >> >> >>>>>  23   hdd  14.65039  1.00000  15 TiB 1.8 TiB 1.7
>> > >>>> TiB 156 MiB 4.4 GiB  13 TiB 12.50 0.17 101     up         osd.23
>> > >>>> >> >> >>> >> >> >>>>>  69   hdd  14.55269        0     0 B     0 B
>> > >>>>  0 B     0 B     0 B     0 B     0    0  11     up         osd.69
>> > >>>> >> >> >>> >> >> >>>>>  73   hdd  14.55269  1.00000  15 TiB  10 TiB  10
>> > >>>> TiB 6.1 MiB  33 GiB 4.2 TiB 71.15 0.95 107     up         osd.73
>> > >>>> >> >> >>> >> >> >>>>>  79   hdd   3.63689  1.00000 3.6 TiB 2.9 TiB 747
>> > >>>> GiB 2.0 GiB     0 B 747 GiB 79.94 1.07  52     up         osd.79
>> > >>>> >> >> >>> >> >> >>>>>  80   hdd   3.63689  1.00000 3.6 TiB 2.6 TiB 1.0
>> > >>>> TiB 1.9 GiB     0 B 1.0 TiB 71.61 0.96  58     up         osd.80
>> > >>>> >> >> >>> >> >> >>>>>  81   hdd   3.63689  1.00000 3.6 TiB 2.2 TiB 1.5
>> > >>>> TiB 1.1 GiB     0 B 1.5 TiB 60.07 0.81  55     up         osd.81
>> > >>>> >> >> >>> >> >> >>>>>  82   hdd   3.63689  1.00000 3.6 TiB 1.9 TiB 1.7
>> > >>>> TiB 536 MiB     0 B 1.7 TiB 52.68 0.71  30     up         osd.82
>> > >>>> >> >> >>> >> >> >>>>> -11        50.94173        -  51 TiB  38 TiB  38
>> > >>>> TiB 3.7 GiB 100 GiB  13 TiB 74.69 1.00   -            host s3db10
>> > >>>> >> >> >>> >> >> >>>>>  63   hdd   7.27739  1.00000 7.3 TiB 5.5 TiB 5.5
>> > >>>> TiB 616 MiB  14 GiB 1.7 TiB 76.04 1.02  92     up         osd.63
>> > >>>> >> >> >>> >> >> >>>>>  64   hdd   7.27739  1.00000 7.3 TiB 5.5 TiB 5.5
>> > >>>> TiB 820 MiB  15 GiB 1.8 TiB 75.54 1.01 101     up         osd.64
>> > >>>> >> >> >>> >> >> >>>>>  65   hdd   7.27739  1.00000 7.3 TiB 5.3 TiB 5.3
>> > >>>> TiB 109 MiB  14 GiB 2.0 TiB 73.17 0.98 105     up         osd.65
>> > >>>> >> >> >>> >> >> >>>>>  66   hdd   7.27739  1.00000 7.3 TiB 5.8 TiB 5.8
>> > >>>> TiB 423 MiB  15 GiB 1.4 TiB 80.38 1.08  98     up         osd.66
>> > >>>> >> >> >>> >> >> >>>>>  67   hdd   7.27739  1.00000 7.3 TiB 5.1 TiB 5.1
>> > >>>> TiB 572 MiB  14 GiB 2.2 TiB 70.10 0.94 100     up         osd.67
>> > >>>> >> >> >>> >> >> >>>>>  68   hdd   7.27739  1.00000 7.3 TiB 5.3 TiB 5.3
>> > >>>> TiB 630 MiB  13 GiB 2.0 TiB 72.88 0.98 107     up         osd.68
>> > >>>> >> >> >>> >> >> >>>>>  70   hdd   7.27739  1.00000 7.3 TiB 5.4 TiB 5.4
>> > >>>> TiB 648 MiB  14 GiB 1.8 TiB 74.73 1.00 102     up         osd.70
>> > >>>> >> >> >>> >> >> >>>>> -12        50.99052        -  51 TiB  39 TiB  39
>> > >>>> TiB 2.9 GiB  99 GiB  12 TiB 77.24 1.04   -            host s3db11
>> > >>>> >> >> >>> >> >> >>>>>  46   hdd   7.27739  1.00000 7.3 TiB 5.7 TiB 5.7
>> > >>>> TiB 102 MiB  15 GiB 1.5 TiB 78.91 1.06  97     up         osd.46
>> > >>>> >> >> >>> >> >> >>>>>  47   hdd   7.27739  1.00000 7.3 TiB 5.2 TiB 5.2
>> > >>>> TiB  61 MiB  13 GiB 2.1 TiB 71.47 0.96  96     up         osd.47
>> > >>>> >> >> >>> >> >> >>>>>  48   hdd   7.27739  1.00000 7.3 TiB 6.1 TiB 6.1
>> > >>>> TiB 853 MiB  15 GiB 1.2 TiB 83.46 1.12 109     up         osd.48
>> > >>>> >> >> >>> >> >> >>>>>  49   hdd   7.27739  1.00000 7.3 TiB 5.7 TiB 5.7
>> > >>>> TiB 708 MiB  15 GiB 1.5 TiB 78.96 1.06  98     up         osd.49
>> > >>>> >> >> >>> >> >> >>>>>  50   hdd   7.27739  1.00000 7.3 TiB 5.9 TiB 5.8
>> > >>>> TiB 472 MiB  15 GiB 1.4 TiB 80.40 1.08 102     up         osd.50
>> > >>>> >> >> >>> >> >> >>>>>  51   hdd   7.27739  1.00000 7.3 TiB 5.9 TiB 5.9
>> > >>>> TiB 729 MiB  15 GiB 1.3 TiB 81.70 1.10 110     up         osd.51
>> > >>>> >> >> >>> >> >> >>>>>  72   hdd   7.32619  1.00000 7.3 TiB 4.8 TiB 4.8
>> > >>>> TiB  91 MiB  12 GiB 2.5 TiB 65.82 0.88  89     up         osd.72
>> > >>>> >> >> >>> >> >> >>>>> -37        58.55478        -  59 TiB  46 TiB  46
>> > >>>> TiB 5.0 GiB 124 GiB  12 TiB 79.04 1.06   -            host s3db12
>> > >>>> >> >> >>> >> >> >>>>>  19   hdd   3.68750  1.00000 3.7 TiB 3.1 TiB 3.1
>> > >>>> TiB 462 MiB 8.2 GiB 559 GiB 85.18 1.14  55     up         osd.19
>> > >>>> >> >> >>> >> >> >>>>>  71   hdd   3.68750  1.00000 3.7 TiB 2.9 TiB 2.8
>> > >>>> TiB 3.9 MiB 7.8 GiB 825 GiB 78.14 1.05  50     up         osd.71
>> > >>>> >> >> >>> >> >> >>>>>  75   hdd   3.68750  1.00000 3.7 TiB 3.1 TiB 3.1
>> > >>>> TiB 576 MiB 8.3 GiB 555 GiB 85.29 1.14  57     up         osd.75
>> > >>>> >> >> >>> >> >> >>>>>  76   hdd   3.68750  1.00000 3.7 TiB 3.2 TiB 3.1
>> > >>>> TiB 239 MiB 9.3 GiB 501 GiB 86.73 1.16  50     up         osd.76
>> > >>>> >> >> >>> >> >> >>>>>  77   hdd  14.60159  1.00000  15 TiB  11 TiB  11
>> > >>>> TiB 880 MiB  30 GiB 3.6 TiB 75.57 1.01 202     up         osd.77
>> > >>>> >> >> >>> >> >> >>>>>  78   hdd  14.60159  1.00000  15 TiB  11 TiB  11
>> > >>>> TiB 1.0 GiB  30 GiB 3.4 TiB 76.65 1.03 196     up         osd.78
>> > >>>> >> >> >>> >> >> >>>>>  83   hdd  14.60159  1.00000  15 TiB  12 TiB  12
>> > >>>> TiB 1.8 GiB  31 GiB 2.9 TiB 80.04 1.07 223     up         osd.83
>> > >>>> >> >> >>> >> >> >>>>>  -3        58.49872        -  58 TiB  43 TiB  38
>> > >>>> TiB 8.1 GiB  91 GiB  16 TiB 73.15 0.98   -            host s3db2
>> > >>>> >> >> >>> >> >> >>>>>   1   hdd  14.65039  1.00000  15 TiB  11 TiB  11
>> > >>>> TiB 3.1 GiB  38 GiB 3.6 TiB 75.52 1.01 194     up         osd.1
>> > >>>> >> >> >>> >> >> >>>>>   3   hdd   3.63689  1.00000 3.6 TiB 2.2 TiB 1.4
>> > >>>> TiB 418 MiB     0 B 1.4 TiB 60.94 0.82  52     up         osd.3
>> > >>>> >> >> >>> >> >> >>>>>   4   hdd   3.63689  0.89999 3.6 TiB 3.2 TiB 401
>> > >>>> GiB 845 MiB     0 B 401 GiB 89.23 1.20  53     up         osd.4
>> > >>>> >> >> >>> >> >> >>>>>   5   hdd   3.63689  1.00000 3.6 TiB 2.3 TiB 1.3
>> > >>>> TiB 437 MiB     0 B 1.3 TiB 62.88 0.84  51     up         osd.5
>> > >>>> >> >> >>> >> >> >>>>>   6   hdd   3.63689  1.00000 3.6 TiB 2.0 TiB 1.7
>> > >>>> TiB 1.8 GiB     0 B 1.7 TiB 54.51 0.73  47     up         osd.6
>> > >>>> >> >> >>> >> >> >>>>>   7   hdd  14.65039  1.00000  15 TiB  11 TiB  11
>> > >>>> TiB 493 MiB  26 GiB 3.8 TiB 73.90 0.99 185     up         osd.7
>> > >>>> >> >> >>> >> >> >>>>>  74   hdd  14.65039  1.00000  15 TiB  11 TiB  11
>> > >>>> TiB 1.1 GiB  27 GiB 3.5 TiB 76.27 1.02 208     up         osd.74
>> > >>>> >> >> >>> >> >> >>>>>  -4        58.49872        -  58 TiB  43 TiB  37
>> > >>>> TiB  33 GiB  86 GiB  15 TiB 74.05 0.99   -            host s3db3
>> > >>>> >> >> >>> >> >> >>>>>   2   hdd  14.65039  1.00000  15 TiB  11 TiB  11
>> > >>>> TiB 850 MiB  26 GiB 4.0 TiB 72.78 0.98 203     up         osd.2
>> > >>>> >> >> >>> >> >> >>>>>   9   hdd  14.65039  1.00000  15 TiB  11 TiB  11
>> > >>>> TiB 8.3 GiB  33 GiB 3.6 TiB 75.62 1.01 189     up         osd.9
>> > >>>> >> >> >>> >> >> >>>>>  10   hdd  14.65039  1.00000  15 TiB  11 TiB  11
>> > >>>> TiB 663 MiB  28 GiB 3.5 TiB 76.34 1.02 211     up         osd.10
>> > >>>> >> >> >>> >> >> >>>>>  12   hdd   3.63689  1.00000 3.6 TiB 2.4 TiB 1.2
>> > >>>> TiB 633 MiB     0 B 1.2 TiB 66.22 0.89  44     up         osd.12
>> > >>>> >> >> >>> >> >> >>>>>  13   hdd   3.63689  1.00000 3.6 TiB 2.9 TiB 720
>> > >>>> GiB 2.3 GiB     0 B 720 GiB 80.66 1.08  66     up         osd.13
>> > >>>> >> >> >>> >> >> >>>>>  14   hdd   3.63689  1.00000 3.6 TiB 3.1 TiB 552
>> > >>>> GiB  18 GiB     0 B 552 GiB 85.18 1.14  60     up         osd.14
>> > >>>> >> >> >>> >> >> >>>>>  15   hdd   3.63689  1.00000 3.6 TiB 2.0 TiB 1.7
>> > >>>> TiB 2.1 GiB     0 B 1.7 TiB 53.72 0.72  44     up         osd.15
>> > >>>> >> >> >>> >> >> >>>>>  -5        58.49872        -  58 TiB  45 TiB  37
>> > >>>> TiB 7.2 GiB  99 GiB  14 TiB 76.37 1.02   -            host s3db4
>> > >>>> >> >> >>> >> >> >>>>>  11   hdd  14.65039  1.00000  15 TiB  12 TiB  12
>> > >>>> TiB 897 MiB  28 GiB 2.8 TiB 81.15 1.09 205     up         osd.11
>> > >>>> >> >> >>> >> >> >>>>>  17   hdd  14.65039  1.00000  15 TiB  11 TiB  11
>> > >>>> TiB 1.2 GiB  27 GiB 3.6 TiB 75.38 1.01 211     up         osd.17
>> > >>>> >> >> >>> >> >> >>>>>  18   hdd  14.65039  1.00000  15 TiB  11 TiB  11
>> > >>>> TiB 965 MiB  44 GiB 4.0 TiB 72.86 0.98 188     up         osd.18
>> > >>>> >> >> >>> >> >> >>>>>  20   hdd   3.63689  1.00000 3.6 TiB 2.9 TiB 796
>> > >>>> GiB 529 MiB     0 B 796 GiB 78.63 1.05  66     up         osd.20
>> > >>>> >> >> >>> >> >> >>>>>  21   hdd   3.63689  1.00000 3.6 TiB 2.6 TiB 1.1
>> > >>>> TiB 2.1 GiB     0 B 1.1 TiB 70.32 0.94  47     up         osd.21
>> > >>>> >> >> >>> >> >> >>>>>  22   hdd   3.63689  1.00000 3.6 TiB 2.9 TiB 802
>> > >>>> GiB 882 MiB     0 B 802 GiB 78.47 1.05  58     up         osd.22
>> > >>>> >> >> >>> >> >> >>>>>  24   hdd   3.63689  1.00000 3.6 TiB 2.8 TiB 856
>> > >>>> GiB 645 MiB     0 B 856 GiB 77.01 1.03  47     up         osd.24
>> > >>>> >> >> >>> >> >> >>>>>  -6        58.89636        -  59 TiB  44 TiB  44
>> > >>>> TiB 2.4 GiB 111 GiB  15 TiB 75.22 1.01   -            host s3db5
>> > >>>> >> >> >>> >> >> >>>>>   0   hdd   3.73630  1.00000 3.7 TiB 2.4 TiB 2.3
>> > >>>> TiB  70 MiB 6.6 GiB 1.3 TiB 65.00 0.87  48     up         osd.0
>> > >>>> >> >> >>> >> >> >>>>>  25   hdd   3.73630  1.00000 3.7 TiB 2.4 TiB 2.3
>> > >>>> TiB 5.3 MiB 6.6 GiB 1.4 TiB 63.86 0.86  41     up         osd.25
>> > >>>> >> >> >>> >> >> >>>>>  26   hdd   3.73630  1.00000 3.7 TiB 2.9 TiB 2.8
>> > >>>> TiB 181 MiB 7.6 GiB 862 GiB 77.47 1.04  48     up         osd.26
>> > >>>> >> >> >>> >> >> >>>>>  27   hdd   3.73630  1.00000 3.7 TiB 2.3 TiB 2.2
>> > >>>> TiB 7.0 MiB 6.1 GiB 1.5 TiB 61.00 0.82  48     up         osd.27
>> > >>>> >> >> >>> >> >> >>>>>  28   hdd  14.65039  1.00000  15 TiB  12 TiB  12
>> > >>>> TiB 937 MiB  30 GiB 2.8 TiB 81.19 1.09 203     up         osd.28
>> > >>>> >> >> >>> >> >> >>>>>  29   hdd  14.65039  1.00000  15 TiB  11 TiB  11
>> > >>>> TiB 536 MiB  26 GiB 3.8 TiB 73.95 0.99 200     up         osd.29
>> > >>>> >> >> >>> >> >> >>>>>  30   hdd  14.65039  1.00000  15 TiB  12 TiB  11
>> > >>>> TiB 744 MiB  28 GiB 3.1 TiB 79.07 1.06 207     up         osd.30
>> > >>>> >> >> >>> >> >> >>>>>  -7        58.89636        -  59 TiB  44 TiB  44
>> > >>>> TiB  14 GiB 122 GiB  14 TiB 75.41 1.01   -            host s3db6
>> > >>>> >> >> >>> >> >> >>>>>  32   hdd   3.73630  1.00000 3.7 TiB 3.1 TiB 3.0
>> > >>>> TiB  16 MiB 8.2 GiB 622 GiB 83.74 1.12  65     up         osd.32
>> > >>>> >> >> >>> >> >> >>>>>  33   hdd   3.73630  0.79999 3.7 TiB 3.0 TiB 2.9
>> > >>>> TiB  14 MiB 8.1 GiB 740 GiB 80.67 1.08  52     up         osd.33
>> > >>>> >> >> >>> >> >> >>>>>  34   hdd   3.73630  0.79999 3.7 TiB 2.9 TiB 2.8
>> > >>>> TiB 449 MiB 7.7 GiB 877 GiB 77.08 1.03  52     up         osd.34
>> > >>>> >> >> >>> >> >> >>>>>  35   hdd   3.73630  0.79999 3.7 TiB 2.3 TiB 2.2
>> > >>>> TiB 133 MiB 7.0 GiB 1.4 TiB 62.18 0.83  42     up         osd.35
>> > >>>> >> >> >>> >> >> >>>>>  36   hdd  14.65039  1.00000  15 TiB  11 TiB  11
>> > >>>> TiB 544 MiB  26 GiB 4.0 TiB 72.98 0.98 220     up         osd.36
>> > >>>> >> >> >>> >> >> >>>>>  37   hdd  14.65039  1.00000  15 TiB  11 TiB  11
>> > >>>> TiB  11 GiB  38 GiB 3.6 TiB 75.30 1.01 200     up         osd.37
>> > >>>> >> >> >>> >> >> >>>>>  38   hdd  14.65039  1.00000  15 TiB  11 TiB  11
>> > >>>> TiB 1.2 GiB  28 GiB 3.3 TiB 77.43 1.04 217     up         osd.38
>> > >>>> >> >> >>> >> >> >>>>>  -8        58.89636        -  59 TiB  47 TiB  46
>> > >>>> TiB 3.9 GiB 116 GiB  12 TiB 78.98 1.06   -            host s3db7
>> > >>>> >> >> >>> >> >> >>>>>  39   hdd   3.73630  1.00000 3.7 TiB 3.2 TiB 3.2
>> > >>>> TiB  19 MiB 8.5 GiB 499 GiB 86.96 1.17  43     up         osd.39
>> > >>>> >> >> >>> >> >> >>>>>  40   hdd   3.73630  1.00000 3.7 TiB 2.6 TiB 2.5
>> > >>>> TiB 144 MiB 7.0 GiB 1.2 TiB 68.33 0.92  39     up         osd.40
>> > >>>> >> >> >>> >> >> >>>>>  41   hdd   3.73630  1.00000 3.7 TiB 3.0 TiB 2.9
>> > >>>> TiB 218 MiB 7.9 GiB 732 GiB 80.86 1.08  64     up         osd.41
>> > >>>> >> >> >>> >> >> >>>>>  42   hdd   3.73630  1.00000 3.7 TiB 2.5 TiB 2.4
>> > >>>> TiB 594 MiB 7.0 GiB 1.2 TiB 67.97 0.91  50     up         osd.42
>> > >>>> >> >> >>> >> >> >>>>>  43   hdd  14.65039  1.00000  15 TiB  12 TiB  12
>> > >>>> TiB 564 MiB  28 GiB 2.9 TiB 80.32 1.08 213     up         osd.43
>> > >>>> >> >> >>> >> >> >>>>>  44   hdd  14.65039  1.00000  15 TiB  12 TiB  11
>> > >>>> TiB 1.3 GiB  28 GiB 3.1 TiB 78.59 1.05 198     up         osd.44
>> > >>>> >> >> >>> >> >> >>>>>  45   hdd  14.65039  1.00000  15 TiB  12 TiB  12
>> > >>>> TiB 1.2 GiB  30 GiB 2.8 TiB 81.05 1.09 214     up         osd.45
>> > >>>> >> >> >>> >> >> >>>>>  -9        51.28331        -  51 TiB  41 TiB  41
>> > >>>> TiB 4.9 GiB 108 GiB  10 TiB 79.75 1.07   -            host s3db8
>> > >>>> >> >> >>> >> >> >>>>>   8   hdd   7.32619  1.00000 7.3 TiB 5.8 TiB 5.8
>> > >>>> TiB 472 MiB  15 GiB 1.5 TiB 79.68 1.07  99     up         osd.8
>> > >>>> >> >> >>> >> >> >>>>>  16   hdd   7.32619  1.00000 7.3 TiB 5.9 TiB 5.8
>> > >>>> TiB 785 MiB  15 GiB 1.4 TiB 80.25 1.08  97     up         osd.16
>> > >>>> >> >> >>> >> >> >>>>>  31   hdd   7.32619  1.00000 7.3 TiB 5.5 TiB 5.5
>> > >>>> TiB 438 MiB  14 GiB 1.8 TiB 75.36 1.01  87     up         osd.31
>> > >>>> >> >> >>> >> >> >>>>>  52   hdd   7.32619  1.00000 7.3 TiB 5.7 TiB 5.7
>> > >>>> TiB 844 MiB  15 GiB 1.6 TiB 78.19 1.05 113     up         osd.52
>> > >>>> >> >> >>> >> >> >>>>>  53   hdd   7.32619  1.00000 7.3 TiB 6.2 TiB 6.1
>> > >>>> TiB 792 MiB  18 GiB 1.1 TiB 84.46 1.13 109     up         osd.53
>> > >>>> >> >> >>> >> >> >>>>>  54   hdd   7.32619  1.00000 7.3 TiB 5.6 TiB 5.6
>> > >>>> TiB 959 MiB  15 GiB 1.7 TiB 76.73 1.03 115     up         osd.54
>> > >>>> >> >> >>> >> >> >>>>>  55   hdd   7.32619  1.00000 7.3 TiB 6.1 TiB 6.1
>> > >>>> TiB 699 MiB  16 GiB 1.2 TiB 83.56 1.12 122     up         osd.55
>> > >>>> >> >> >>> >> >> >>>>> -10        51.28331        -  51 TiB  39 TiB  39
>> > >>>> TiB 4.7 GiB 100 GiB  12 TiB 76.05 1.02   -            host s3db9
>> > >>>> >> >> >>> >> >> >>>>>  56   hdd   7.32619  1.00000 7.3 TiB 5.2 TiB 5.2
>> > >>>> TiB 840 MiB  13 GiB 2.1 TiB 71.06 0.95 105     up         osd.56
>> > >>>> >> >> >>> >> >> >>>>>  57   hdd   7.32619  1.00000 7.3 TiB 6.1 TiB 6.0
>> > >>>> TiB 1.0 GiB  16 GiB 1.2 TiB 83.17 1.12 102     up         osd.57
>> > >>>> >> >> >>> >> >> >>>>>  58   hdd   7.32619  1.00000 7.3 TiB 6.0 TiB 5.9
>> > >>>> TiB  43 MiB  15 GiB 1.4 TiB 81.56 1.09 105     up         osd.58
>> > >>>> >> >> >>> >> >> >>>>>  59   hdd   7.32619  1.00000 7.3 TiB 5.9 TiB 5.9
>> > >>>> TiB 429 MiB  15 GiB 1.4 TiB 80.64 1.08  94     up         osd.59
>> > >>>> >> >> >>> >> >> >>>>>  60   hdd   7.32619  1.00000 7.3 TiB 5.4 TiB 5.3
>> > >>>> TiB 226 MiB  14 GiB 2.0 TiB 73.25 0.98 101     up         osd.60
>> > >>>> >> >> >>> >> >> >>>>>  61   hdd   7.32619  1.00000 7.3 TiB 4.8 TiB 4.8
>> > >>>> TiB 1.1 GiB  12 GiB 2.5 TiB 65.84 0.88 103     up         osd.61
>> > >>>> >> >> >>> >> >> >>>>>  62   hdd   7.32619  1.00000 7.3 TiB 5.6 TiB 5.6
>> > >>>> TiB 1.0 GiB  15 GiB 1.7 TiB 76.83 1.03 126     up         osd.62
>> > >>>> >> >> >>> >> >> >>>>>                        TOTAL 674 TiB 501 TiB 473
>> > >>>> TiB  96 GiB 1.2 TiB 173 TiB 74.57
>> > >>>> >> >> >>> >> >> >>>>> MIN/MAX VAR: 0.17/1.20  STDDEV: 10.25
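
[Editorial sketch, not part of the original thread.] The VAR column and the MIN/MAX VAR summary in the table above can be reproduced from the %USE values: each OSD's VAR is its utilization divided by the cluster-wide utilization on the TOTAL line. The short Python check below uses figures from the table; the function name `var_ratio` is made up for this note.

```python
def var_ratio(osd_use_pct, cluster_use_pct):
    """Ratio of one OSD's %USE to the cluster-wide %USE, rounded like the table."""
    return round(osd_use_pct / cluster_use_pct, 2)


cluster_use = 74.57  # %USE on the TOTAL line

print(var_ratio(89.23, cluster_use))  # osd.4, the fullest OSD (MAX VAR)
print(var_ratio(12.50, cluster_use))  # osd.23, recently re-added (MIN VAR)
```

The two results correspond to the 1.20 and 0.17 shown in the MIN/MAX VAR line, so a VAR above 1 simply marks an OSD that is fuller than the cluster average.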
>> > >>>> >> >> >>> >> >> >>>>>
>> > >>>> >> >> >>> >> >> >>>>>
>> > >>>> >> >> >>> >> >> >>>>>
>> > >>>> >> >> >>> >> >> >>>>> On Sat, Mar 13, 2021 at 3:57 PM Dan
>> > >>>> van der Ster <dan@xxxxxxxxxxxxxx> wrote:
>> > >>>> >> >> >>> >> >> >>>>>>
>> > >>>> >> >> >>> >> >> >>>>>> No, increasing num PGs won't help substantially.
>> > >>>> >> >> >>> >> >> >>>>>>
>> > >>>> >> >> >>> >> >> >>>>>> Can you share the entire output of ceph osd df
>> > >>>> tree ?
>> > >>>> >> >> >>> >> >> >>>>>>
>> > >>>> >> >> >>> >> >> >>>>>> Did you already set
>> > >>>> >> >> >>> >> >> >>>>>>
>> > >>>> >> >> >>> >> >> >>>>>>   ceph config set mgr mgr/balancer/upmap_max_deviation 1
>> > >>>> >> >> >>> >> >> >>>>>>
>> > >>>> >> >> >>> >> >> >>>>>> ?
>> > >>>> >> >> >>> >> >> >>>>>> And I recommend debug_mgr 4/5 so you can see
>> > >>>> some basic upmap balancer logging.
>> > >>>> >> >> >>> >> >> >>>>>>
>> > >>>> >> >> >>> >> >> >>>>>> .. Dan
>> > >>>> >> >> >>> >> >> >>>>>>
>> > >>>> >> >> >>> >> >> >>>>>>
>> > >>>> >> >> >>> >> >> >>>>>>
>> > >>>> >> >> >>> >> >> >>>>>>
>> > >>>> >> >> >>> >> >> >>>>>>
>> > >>>> >> >> >>> >> >> >>>>>>
>> > >>>> >> >> >>> >> >> >>>>>> On Sat, Mar 13, 2021, 3:49 PM Boris Behrens <
>> > >>>> bb@xxxxxxxxx> wrote:
>> > >>>> >> >> >>> >> >> >>>>>>>
>> > >>>> >> >> >>> >> >> >>>>>>> Hello people,
>> > >>>> >> >> >>> >> >> >>>>>>>
>> > >>>> >> >> >>> >> >> >>>>>>> I am still struggling with the balancer
>> > >>>> >> >> >>> >> >> >>>>>>> (
>> > >>>> https://www.mail-archive.com/ceph-users@xxxxxxx/msg09124.html)
>> > >>>> >> >> >>> >> >> >>>>>>> Now I've read some more and think that I might
>> > >>>> not have enough PGs.
>> > >>>> >> >> >>> >> >> >>>>>>> Currently I have 84 OSDs and 1024 PGs for the
>> > >>>> main pool (3008 total). I
>> > >>>> >> >> >>> >> >> >>>>>>> have the autoscaler enabled, but it doesn't tell
>> > >>>> me to increase the
>> > >>>> >> >> >>> >> >> >>>>>>> PGs.
>> > >>>> >> >> >>> >> >> >>>>>>>
>> > >>>> >> >> >>> >> >> >>>>>>> What do you think?
>> > >>>> >> >> >>> >> >> >>>>>>>
>> > >>>> >> >> >>> >> >> >>>>>>> --
>> > >>>> >> >> >>> >> >> >>>>>>> Die Selbsthilfegruppe "UTF-8-Probleme" trifft
>> > >>>> sich diesmal abweichend
>> > >>>> >> >> >>> >> >> >>>>>>> im groüen Saal.
>> > >>>> >> >> >>> >> >> >>>>>>> _______________________________________________
>> > >>>> >> >> >>> >> >> >>>>>>> ceph-users mailing list -- ceph-users@xxxxxxx
>> > >>>> >> >> >>> >> >> >>>>>>> To unsubscribe send an email to
>> > >>>> ceph-users-leave@xxxxxxx
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx




