Re: After adding new OSDs, Pool Max Avail did not change.

Googling for that balancer error message, I came across
https://tracker.ceph.com/issues/22814, which was closed as won't-fix,
and some threads claiming that class-based CRUSH rules are implemented
with per-class shadow trees behind the scenes. I'm not sure how
accurate that is.
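
(If you want to check that yourself, I believe the shadow hierarchies
can be listed with:

    ceph osd crush tree --show-shadow

which, if I remember right, prints the per-class roots such as
default~ssd and default~hdd alongside the regular tree.)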

The only suggestion I have, echoing one of those threads, is to switch
to the upmap balancer instead, if that's possible in your environment.
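
Roughly, that would look like the following (an untested sketch, so
please verify against your cluster first; upmap requires every
connected client to support luminous+ features):

    ceph features   # confirm no pre-luminous clients are connected
    ceph osd set-require-min-compat-client luminous
    ceph balancer off
    ceph balancer mode upmap
    ceph balancer on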

Josh

On Wed, Sep 1, 2021 at 2:38 AM mhnx <morphinwithyou@xxxxxxxxx> wrote:
>
> ceph osd crush tree (I only have one subtree and its root default)
> ID  CLASS WEIGHT     (compat)  TYPE NAME
>  -1       2785.87891           root default
>  -3        280.04803 280.04803     host NODE-1
>   0   hdd   14.60149  14.60149         osd.0
>  19   ssd    0.87320   0.87320         osd.19
> 208   ssd    0.87329   0.87329         osd.208
> 209   ssd    0.87329   0.87329         osd.209
>  -7        280.04803 280.04803     host NODE-2
>  38   hdd   14.60149  14.60149         osd.38
>  39   ssd    0.87320   0.87320         osd.39
> 207   ssd    0.87329   0.87329         osd.207
> 210   ssd    0.87329   0.87329         osd.210
> -10        280.04803 280.04803     host NODE-3
>  58   hdd   14.60149  14.60149         osd.58
>  59   ssd    0.87320   0.87320         osd.59
> 203   ssd    0.87329   0.87329         osd.203
> 211   ssd    0.87329   0.87329         osd.211
> -13        280.04803 280.04803     host NODE-4
>  78   hdd   14.60149  14.60149         osd.78
>  79   ssd    0.87320   0.87320         osd.79
> 206   ssd    0.87329   0.87329         osd.206
> 212   ssd    0.87329   0.87329         osd.212
> -16        280.04803 280.04803     host NODE-5
>  98   hdd   14.60149  14.60149         osd.98
>  99   ssd    0.87320   0.87320         osd.99
> 205   ssd    0.87329   0.87329         osd.205
> 213   ssd    0.87329   0.87329         osd.213
> -19        265.44662 265.44662     host NODE-6
> 118   hdd   14.60149  14.60149         osd.118
> 114   ssd    0.87329   0.87329         osd.114
> 200   ssd    0.87329   0.87329         osd.200
> 214   ssd    0.87329   0.87329         osd.214
> -22        280.04803 280.04803     host NODE-7
> 138   hdd   14.60149  14.60149         osd.138
> 139   ssd    0.87320   0.87320         osd.139
> 204   ssd    0.87329   0.87329         osd.204
> 215   ssd    0.87329   0.87329         osd.215
> -25        280.04810 280.04810     host NODE-8
> 158   hdd   14.60149  14.60149         osd.158
> 119   ssd    0.87329   0.87329         osd.119
> 159   ssd    0.87329   0.87329         osd.159
> 216   ssd    0.87329   0.87329         osd.216
> -28        280.04810 280.04810     host NODE-9
> 178   hdd   14.60149  14.60149         osd.178
> 179   ssd    0.87329   0.87329         osd.179
> 201   ssd    0.87329   0.87329         osd.201
> 217   ssd    0.87329   0.87329         osd.217
> -31        280.04803 280.04803     host NODE-10
> 180   hdd   14.60149  14.60149         osd.180
> 199   ssd    0.87320   0.87320         osd.199
> 202   ssd    0.87329   0.87329         osd.202
> 218   ssd    0.87329   0.87329         osd.218
>
> This pg "6.dc" is on 199,213,217 OSD's.
>
> 6.dc        812                  0        0         0       0   1369675264           0          0 3005     3005                active+clean 2021-08-31 16:36:06.645208    32265'415965  32265:287175109                        [199,213,217]
>
> ceph osd df tree | grep "CLASS\|ssd" | grep ".199\|.213\|217"
> 199   ssd    0.87320  1.00000 894 GiB 281 GiB 119 GiB 159 GiB 2.5 GiB 614 GiB 31.38 0.52 103     up         osd.199
> 213   ssd    0.87329  1.00000 894 GiB 291 GiB  95 GiB 195 GiB 2.3 GiB 603 GiB 32.59 0.54  95     up         osd.213
> 217   ssd    0.87329  1.00000 894 GiB 261 GiB  83 GiB 176 GiB 2.3 GiB 633 GiB 29.18 0.48  89     up         osd.217
>
> As you can see, the PG lives on 3 SSD OSDs and one of them is a new one, so we can't say it belongs to someone else.
>
> rule ssd-rule {
>         id 1
>         type replicated
>         step take default class ssd
>         step chooseleaf firstn 0 type host
>         step emit
> }
>
> pool 54 'rgw.buckets.index' replicated size 3 min_size 1 crush_rule 1 object_hash rjenkins pg_num 128 pgp_num 128 autoscale_mode warn last_change 31607 lfor 0/0/30823 flags hashpspool stripe_width 0 compression_algorithm lz4 compression_mode aggressive application rgw
>
> What is the next step?
>
>
> Josh Baergen <jbaergen@xxxxxxxxxxxxxxxx> wrote the following on Wed, Sep 1, 2021 at 04:03:
>>
>> Yeah, I would suggest inspecting your CRUSH tree. Unfortunately the
>> grep above removed that information from 'df tree', but from the
>> information you provided there does appear to be a significant
>> imbalance remaining.
>>
>> Josh
>>
>> On Tue, Aug 31, 2021 at 6:02 PM mhnx <morphinwithyou@xxxxxxxxx> wrote:
>> >
>> > Hello Josh!
>> >
>> > The balancer is active in crush-compat mode. Balancing is done and there are no remapped PGs in ceph -s.
>> >
>> > ceph osd df tree | grep 'CLASS\|ssd'
>> >
>> > ID  CLASS WEIGHT     REWEIGHT SIZE    RAW USE DATA    OMAP    META    AVAIL   %USE  VAR  PGS STATUS TYPE NAME
>> >  19   ssd    0.87320  1.00000 894 GiB 402 GiB 117 GiB 281 GiB 3.0 GiB 492 GiB 44.93 0.74 102     up         osd.19
>> > 208   ssd    0.87329  1.00000 894 GiB 205 GiB  85 GiB 113 GiB 6.6 GiB 690 GiB 22.89 0.38  95     up         osd.208
>> > 209   ssd    0.87329  1.00000 894 GiB 204 GiB  87 GiB 114 GiB 2.7 GiB 690 GiB 22.84 0.38  65     up         osd.209
>> > 199   ssd    0.87320  1.00000 894 GiB 281 GiB 118 GiB 159 GiB 2.8 GiB 614 GiB 31.37 0.52 103     up         osd.199
>> > 202   ssd    0.87329  1.00000 894 GiB 278 GiB  89 GiB 183 GiB 6.3 GiB 616 GiB 31.08 0.51  97     up         osd.202
>> > 218   ssd    0.87329  1.00000 894 GiB 201 GiB  75 GiB 124 GiB 1.8 GiB 693 GiB 22.46 0.37  84     up         osd.218
>> >  39   ssd    0.87320  1.00000 894 GiB 334 GiB  86 GiB 242 GiB 5.3 GiB 560 GiB 37.34 0.61  91     up         osd.39
>> > 207   ssd    0.87329  1.00000 894 GiB 232 GiB  88 GiB 138 GiB 7.0 GiB 662 GiB 25.99 0.43  81     up         osd.207
>> > 210   ssd    0.87329  1.00000 894 GiB 270 GiB 109 GiB 160 GiB 1.4 GiB 624 GiB 30.18 0.50  99     up         osd.210
>> >  59   ssd    0.87320  1.00000 894 GiB 374 GiB 127 GiB 244 GiB 3.1 GiB 520 GiB 41.79 0.69  97     up         osd.59
>> > 203   ssd    0.87329  1.00000 894 GiB 314 GiB  96 GiB 210 GiB 7.5 GiB 581 GiB 35.06 0.58 104     up         osd.203
>> > 211   ssd    0.87329  1.00000 894 GiB 231 GiB  60 GiB 169 GiB 1.7 GiB 663 GiB 25.82 0.42  81     up         osd.211
>> >  79   ssd    0.87320  1.00000 894 GiB 409 GiB 109 GiB 298 GiB 2.0 GiB 486 GiB 45.70 0.75 102     up         osd.79
>> > 206   ssd    0.87329  1.00000 894 GiB 284 GiB 107 GiB 175 GiB 1.9 GiB 610 GiB 31.79 0.52  94     up         osd.206
>> > 212   ssd    0.87329  1.00000 894 GiB 239 GiB  85 GiB 152 GiB 2.0 GiB 655 GiB 26.71 0.44  80     up         osd.212
>> >  99   ssd    0.87320  1.00000 894 GiB 392 GiB  73 GiB 314 GiB 4.7 GiB 503 GiB 43.79 0.72  85     up         osd.99
>> > 205   ssd    0.87329  1.00000 894 GiB 445 GiB  87 GiB 353 GiB 4.8 GiB 449 GiB 49.80 0.82  95     up         osd.205
>> > 213   ssd    0.87329  1.00000 894 GiB 291 GiB  94 GiB 194 GiB 2.3 GiB 603 GiB 32.57 0.54  95     up         osd.213
>> > 114   ssd    0.87329  1.00000 894 GiB 319 GiB 125 GiB 191 GiB 3.0 GiB 575 GiB 35.67 0.59  99     up         osd.114
>> > 200   ssd    0.87329  1.00000 894 GiB 231 GiB  78 GiB 150 GiB 2.9 GiB 663 GiB 25.83 0.42  90     up         osd.200
>> > 214   ssd    0.87329  1.00000 894 GiB 296 GiB 106 GiB 187 GiB 2.6 GiB 598 GiB 33.09 0.54 100     up         osd.214
>> > 139   ssd    0.87320  1.00000 894 GiB 270 GiB  98 GiB 169 GiB 2.3 GiB 624 GiB 30.18 0.50  96     up         osd.139
>> > 204   ssd    0.87329  1.00000 894 GiB 301 GiB 117 GiB 181 GiB 2.9 GiB 593 GiB 33.64 0.55 104     up         osd.204
>> > 215   ssd    0.87329  1.00000 894 GiB 203 GiB  78 GiB 122 GiB 3.3 GiB 691 GiB 22.69 0.37  81     up         osd.215
>> > 119   ssd    0.87329  1.00000 894 GiB 200 GiB 106 GiB  92 GiB 2.0 GiB 694 GiB 22.39 0.37  99     up         osd.119
>> > 159   ssd    0.87329  1.00000 894 GiB 213 GiB  96 GiB 113 GiB 3.2 GiB 682 GiB 23.77 0.39  93     up         osd.159
>> > 216   ssd    0.87329  1.00000 894 GiB 322 GiB 109 GiB 211 GiB 1.8 GiB 573 GiB 35.96 0.59 101     up         osd.216
>> > 179   ssd    0.87329  1.00000 894 GiB 389 GiB  85 GiB 300 GiB 3.2 GiB 505 GiB 43.49 0.71 104     up         osd.179
>> > 201   ssd    0.87329  1.00000 894 GiB 494 GiB 104 GiB 386 GiB 4.1 GiB 401 GiB 55.20 0.91 103     up         osd.201
>> > 217   ssd    0.87329  1.00000 894 GiB 261 GiB  83 GiB 176 GiB 2.3 GiB 634 GiB 29.15 0.48  89     up         osd.217
>> >
>> >
>> > When I checked the balancer status, I saw: "optimize_result": "Some osds belong to multiple subtrees:"
>> > Do I need to check the crushmap?
>> >
>> >
>> >
>> > Josh Baergen <jbaergen@xxxxxxxxxxxxxxxx> wrote the following on Tue, Aug 31, 2021 at 22:32:
>> >>
>> >> Hi there,
>> >>
>> >> Could you post the output of "ceph osd df tree"? I would highly
>> >> suspect that this is a result of imbalance, and that's the easiest way
>> >> to see if that's the case. It would also confirm that the new disks
>> >> have taken on PGs.
>> >>
>> >> Josh
>> >>
>> >> On Tue, Aug 31, 2021 at 10:50 AM mhnx <morphinwithyou@xxxxxxxxx> wrote:
>> >> >
>> >> > I'm using Nautilus 14.2.16
>> >> >
>> >> > I had 20 SSD OSDs in my cluster and I added 10 more (each SSD is 960 GB).
>> >> > The raw size increased to *(26 TiB)* as expected, but the replicated (size 3)
>> >> > pool's MAX AVAIL did not change *(3.5 TiB)*.
>> >> > I've increased pg_num and the PG rebalance is also done.
>> >> >
>> >> > Do I need any special treatment to expand the pool Max Avail?
>> >> >
>> >> > CLASS     SIZE        AVAIL       USED        RAW USED     %RAW USED
>> >> >     hdd       2.7 PiB     1.0 PiB     1.6 PiB      1.6 PiB         61.12
>> >> >     ssd        *26 TiB*      18 TiB     2.8 TiB      8.7 TiB         33.11
>> >> >     TOTAL     2.7 PiB     1.1 PiB     1.6 PiB      1.7 PiB         60.85
>> >> >
>> >> > POOLS:
>> >> >     POOL                       ID     PGS     STORED      OBJECTS     USED        %USED     MAX AVAIL
>> >> >     xxx.rgw.buckets.index      54     128     541 GiB     435.69k     541 GiB     4.82      *3.5 TiB*
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



