Re: should I increase the amount of PGs?

OK.
Btw, you might need to fail over to a new mgr... I'm not sure whether the
currently active one will pick up that new config.

.. dan


On Sat, Mar 13, 2021, 4:36 PM Boris Behrens <bb@xxxxxxxxx> wrote:

> Hi,
>
> ok, thanks. I just changed the value and reweighted everything back to 1.
> Now I'll let it sync over the weekend and check how it looks on Monday.
> We tried to keep the systems' total storage as balanced as possible. New
> systems will come with 8TB disks, but for the existing ones we added 16TB
> disks to offset the 4TB ones, because we needed a lot of storage fast due
> to a DC move. If you have any recommendations I would be happy to hear them.
>
> Cheers
>  Boris
>
> On Sat, Mar 13, 2021 at 16:20, Dan van der Ster <
> dan@xxxxxxxxxxxxxx> wrote:
>
>> Thanks.
>>
>> Decreasing the max deviation to 2 or 1 should help in your case. This
>> option controls when the balancer stops trying to move PGs around -- by
>> default it stops when the deviation from the mean is 5. Yes this is too
>> large IMO -- all of our clusters have this set to 1.
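To make the stopping rule concrete, here is a simplified sketch of what upmap_max_deviation means (this models the idea only; the real balancer works per CRUSH subtree and pool, not on a flat list):

```python
def balancer_would_stop(pgs_per_osd, max_deviation=5):
    """Simplified model of mgr/balancer/upmap_max_deviation: the balancer
    stops moving PGs once every OSD's PG count is within max_deviation
    of the mean. (Sketch only, not the actual mgr code.)"""
    mean = sum(pgs_per_osd) / len(pgs_per_osd)
    return all(abs(n - mean) <= max_deviation for n in pgs_per_osd)

# With the default of 5, this spread already counts as "balanced enough":
print(balancer_would_stop([100, 104, 97, 102]))      # True
# Tightening the deviation to 1 forces the balancer to keep optimizing:
print(balancer_would_stop([100, 104, 97, 102], 1))   # False
```

This is why a default of 5 can leave a noticeable %USE spread: a few PGs of slack per OSD translates to terabytes on large disks.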
>>
>> And given that you have some OSDs with more than 200 PGs, you definitely
>> shouldn't increase the num PGs.
>>
>> But anyway, with your mixed device sizes it might be challenging to achieve
>> a perfectly uniform distribution. Give it a try with 1, though, and let us
>> know how it goes.
>>
>> .. Dan
>>
>>
>>
>>
>>
>> On Sat, Mar 13, 2021, 4:11 PM Boris Behrens <bb@xxxxxxxxx> wrote:
>>
>>> Hi Dan,
>>>
>>> upmap_max_deviation is default (5) in our cluster. Is 1 the recommended
>>> deviation?
>>>
>>> I added the whole `ceph osd df tree` output. (I need to remove some OSDs
>>> and re-add them as BlueStore with SSD, so 69, 73 and 82 are a bit off
>>> right now. I also reweighted some OSDs to try to mitigate the %USE spread.)
>>>
>>> I will increase the mgr debug level to see what the problem is.
>>>
>>> [root@s3db1 ~]# ceph osd df tree
>>> ID  CLASS WEIGHT    REWEIGHT SIZE    RAW USE DATA    OMAP    META    AVAIL   %USE  VAR  PGS STATUS TYPE NAME
>>>  -1       673.54224        - 659 TiB 491 TiB 464 TiB  96 GiB 1.2 TiB 168 TiB 74.57 1.00   -        root default
>>>  -2        58.30331        -  44 TiB  22 TiB  17 TiB 5.7 GiB  38 GiB  22 TiB 49.82 0.67   -            host s3db1
>>>  23   hdd  14.65039  1.00000  15 TiB 1.8 TiB 1.7 TiB 156 MiB 4.4 GiB  13 TiB 12.50 0.17 101     up         osd.23
>>>  69   hdd  14.55269        0     0 B     0 B     0 B     0 B     0 B     0 B     0    0  11     up         osd.69
>>>  73   hdd  14.55269  1.00000  15 TiB  10 TiB  10 TiB 6.1 MiB  33 GiB 4.2 TiB 71.15 0.95 107     up         osd.73
>>>  79   hdd   3.63689  1.00000 3.6 TiB 2.9 TiB 747 GiB 2.0 GiB     0 B 747 GiB 79.94 1.07  52     up         osd.79
>>>  80   hdd   3.63689  1.00000 3.6 TiB 2.6 TiB 1.0 TiB 1.9 GiB     0 B 1.0 TiB 71.61 0.96  58     up         osd.80
>>>  81   hdd   3.63689  1.00000 3.6 TiB 2.2 TiB 1.5 TiB 1.1 GiB     0 B 1.5 TiB 60.07 0.81  55     up         osd.81
>>>  82   hdd   3.63689  1.00000 3.6 TiB 1.9 TiB 1.7 TiB 536 MiB     0 B 1.7 TiB 52.68 0.71  30     up         osd.82
>>> -11        50.94173        -  51 TiB  38 TiB  38 TiB 3.7 GiB 100 GiB  13 TiB 74.69 1.00   -            host s3db10
>>>  63   hdd   7.27739  1.00000 7.3 TiB 5.5 TiB 5.5 TiB 616 MiB  14 GiB 1.7 TiB 76.04 1.02  92     up         osd.63
>>>  64   hdd   7.27739  1.00000 7.3 TiB 5.5 TiB 5.5 TiB 820 MiB  15 GiB 1.8 TiB 75.54 1.01 101     up         osd.64
>>>  65   hdd   7.27739  1.00000 7.3 TiB 5.3 TiB 5.3 TiB 109 MiB  14 GiB 2.0 TiB 73.17 0.98 105     up         osd.65
>>>  66   hdd   7.27739  1.00000 7.3 TiB 5.8 TiB 5.8 TiB 423 MiB  15 GiB 1.4 TiB 80.38 1.08  98     up         osd.66
>>>  67   hdd   7.27739  1.00000 7.3 TiB 5.1 TiB 5.1 TiB 572 MiB  14 GiB 2.2 TiB 70.10 0.94 100     up         osd.67
>>>  68   hdd   7.27739  1.00000 7.3 TiB 5.3 TiB 5.3 TiB 630 MiB  13 GiB 2.0 TiB 72.88 0.98 107     up         osd.68
>>>  70   hdd   7.27739  1.00000 7.3 TiB 5.4 TiB 5.4 TiB 648 MiB  14 GiB 1.8 TiB 74.73 1.00 102     up         osd.70
>>> -12        50.99052        -  51 TiB  39 TiB  39 TiB 2.9 GiB  99 GiB  12 TiB 77.24 1.04   -            host s3db11
>>>  46   hdd   7.27739  1.00000 7.3 TiB 5.7 TiB 5.7 TiB 102 MiB  15 GiB 1.5 TiB 78.91 1.06  97     up         osd.46
>>>  47   hdd   7.27739  1.00000 7.3 TiB 5.2 TiB 5.2 TiB  61 MiB  13 GiB 2.1 TiB 71.47 0.96  96     up         osd.47
>>>  48   hdd   7.27739  1.00000 7.3 TiB 6.1 TiB 6.1 TiB 853 MiB  15 GiB 1.2 TiB 83.46 1.12 109     up         osd.48
>>>  49   hdd   7.27739  1.00000 7.3 TiB 5.7 TiB 5.7 TiB 708 MiB  15 GiB 1.5 TiB 78.96 1.06  98     up         osd.49
>>>  50   hdd   7.27739  1.00000 7.3 TiB 5.9 TiB 5.8 TiB 472 MiB  15 GiB 1.4 TiB 80.40 1.08 102     up         osd.50
>>>  51   hdd   7.27739  1.00000 7.3 TiB 5.9 TiB 5.9 TiB 729 MiB  15 GiB 1.3 TiB 81.70 1.10 110     up         osd.51
>>>  72   hdd   7.32619  1.00000 7.3 TiB 4.8 TiB 4.8 TiB  91 MiB  12 GiB 2.5 TiB 65.82 0.88  89     up         osd.72
>>> -37        58.55478        -  59 TiB  46 TiB  46 TiB 5.0 GiB 124 GiB  12 TiB 79.04 1.06   -            host s3db12
>>>  19   hdd   3.68750  1.00000 3.7 TiB 3.1 TiB 3.1 TiB 462 MiB 8.2 GiB 559 GiB 85.18 1.14  55     up         osd.19
>>>  71   hdd   3.68750  1.00000 3.7 TiB 2.9 TiB 2.8 TiB 3.9 MiB 7.8 GiB 825 GiB 78.14 1.05  50     up         osd.71
>>>  75   hdd   3.68750  1.00000 3.7 TiB 3.1 TiB 3.1 TiB 576 MiB 8.3 GiB 555 GiB 85.29 1.14  57     up         osd.75
>>>  76   hdd   3.68750  1.00000 3.7 TiB 3.2 TiB 3.1 TiB 239 MiB 9.3 GiB 501 GiB 86.73 1.16  50     up         osd.76
>>>  77   hdd  14.60159  1.00000  15 TiB  11 TiB  11 TiB 880 MiB  30 GiB 3.6 TiB 75.57 1.01 202     up         osd.77
>>>  78   hdd  14.60159  1.00000  15 TiB  11 TiB  11 TiB 1.0 GiB  30 GiB 3.4 TiB 76.65 1.03 196     up         osd.78
>>>  83   hdd  14.60159  1.00000  15 TiB  12 TiB  12 TiB 1.8 GiB  31 GiB 2.9 TiB 80.04 1.07 223     up         osd.83
>>>  -3        58.49872        -  58 TiB  43 TiB  38 TiB 8.1 GiB  91 GiB  16 TiB 73.15 0.98   -            host s3db2
>>>   1   hdd  14.65039  1.00000  15 TiB  11 TiB  11 TiB 3.1 GiB  38 GiB 3.6 TiB 75.52 1.01 194     up         osd.1
>>>   3   hdd   3.63689  1.00000 3.6 TiB 2.2 TiB 1.4 TiB 418 MiB     0 B 1.4 TiB 60.94 0.82  52     up         osd.3
>>>   4   hdd   3.63689  0.89999 3.6 TiB 3.2 TiB 401 GiB 845 MiB     0 B 401 GiB 89.23 1.20  53     up         osd.4
>>>   5   hdd   3.63689  1.00000 3.6 TiB 2.3 TiB 1.3 TiB 437 MiB     0 B 1.3 TiB 62.88 0.84  51     up         osd.5
>>>   6   hdd   3.63689  1.00000 3.6 TiB 2.0 TiB 1.7 TiB 1.8 GiB     0 B 1.7 TiB 54.51 0.73  47     up         osd.6
>>>   7   hdd  14.65039  1.00000  15 TiB  11 TiB  11 TiB 493 MiB  26 GiB 3.8 TiB 73.90 0.99 185     up         osd.7
>>>  74   hdd  14.65039  1.00000  15 TiB  11 TiB  11 TiB 1.1 GiB  27 GiB 3.5 TiB 76.27 1.02 208     up         osd.74
>>>  -4        58.49872        -  58 TiB  43 TiB  37 TiB  33 GiB  86 GiB  15 TiB 74.05 0.99   -            host s3db3
>>>   2   hdd  14.65039  1.00000  15 TiB  11 TiB  11 TiB 850 MiB  26 GiB 4.0 TiB 72.78 0.98 203     up         osd.2
>>>   9   hdd  14.65039  1.00000  15 TiB  11 TiB  11 TiB 8.3 GiB  33 GiB 3.6 TiB 75.62 1.01 189     up         osd.9
>>>  10   hdd  14.65039  1.00000  15 TiB  11 TiB  11 TiB 663 MiB  28 GiB 3.5 TiB 76.34 1.02 211     up         osd.10
>>>  12   hdd   3.63689  1.00000 3.6 TiB 2.4 TiB 1.2 TiB 633 MiB     0 B 1.2 TiB 66.22 0.89  44     up         osd.12
>>>  13   hdd   3.63689  1.00000 3.6 TiB 2.9 TiB 720 GiB 2.3 GiB     0 B 720 GiB 80.66 1.08  66     up         osd.13
>>>  14   hdd   3.63689  1.00000 3.6 TiB 3.1 TiB 552 GiB  18 GiB     0 B 552 GiB 85.18 1.14  60     up         osd.14
>>>  15   hdd   3.63689  1.00000 3.6 TiB 2.0 TiB 1.7 TiB 2.1 GiB     0 B 1.7 TiB 53.72 0.72  44     up         osd.15
>>>  -5        58.49872        -  58 TiB  45 TiB  37 TiB 7.2 GiB  99 GiB  14 TiB 76.37 1.02   -            host s3db4
>>>  11   hdd  14.65039  1.00000  15 TiB  12 TiB  12 TiB 897 MiB  28 GiB 2.8 TiB 81.15 1.09 205     up         osd.11
>>>  17   hdd  14.65039  1.00000  15 TiB  11 TiB  11 TiB 1.2 GiB  27 GiB 3.6 TiB 75.38 1.01 211     up         osd.17
>>>  18   hdd  14.65039  1.00000  15 TiB  11 TiB  11 TiB 965 MiB  44 GiB 4.0 TiB 72.86 0.98 188     up         osd.18
>>>  20   hdd   3.63689  1.00000 3.6 TiB 2.9 TiB 796 GiB 529 MiB     0 B 796 GiB 78.63 1.05  66     up         osd.20
>>>  21   hdd   3.63689  1.00000 3.6 TiB 2.6 TiB 1.1 TiB 2.1 GiB     0 B 1.1 TiB 70.32 0.94  47     up         osd.21
>>>  22   hdd   3.63689  1.00000 3.6 TiB 2.9 TiB 802 GiB 882 MiB     0 B 802 GiB 78.47 1.05  58     up         osd.22
>>>  24   hdd   3.63689  1.00000 3.6 TiB 2.8 TiB 856 GiB 645 MiB     0 B 856 GiB 77.01 1.03  47     up         osd.24
>>>  -6        58.89636        -  59 TiB  44 TiB  44 TiB 2.4 GiB 111 GiB  15 TiB 75.22 1.01   -            host s3db5
>>>   0   hdd   3.73630  1.00000 3.7 TiB 2.4 TiB 2.3 TiB  70 MiB 6.6 GiB 1.3 TiB 65.00 0.87  48     up         osd.0
>>>  25   hdd   3.73630  1.00000 3.7 TiB 2.4 TiB 2.3 TiB 5.3 MiB 6.6 GiB 1.4 TiB 63.86 0.86  41     up         osd.25
>>>  26   hdd   3.73630  1.00000 3.7 TiB 2.9 TiB 2.8 TiB 181 MiB 7.6 GiB 862 GiB 77.47 1.04  48     up         osd.26
>>>  27   hdd   3.73630  1.00000 3.7 TiB 2.3 TiB 2.2 TiB 7.0 MiB 6.1 GiB 1.5 TiB 61.00 0.82  48     up         osd.27
>>>  28   hdd  14.65039  1.00000  15 TiB  12 TiB  12 TiB 937 MiB  30 GiB 2.8 TiB 81.19 1.09 203     up         osd.28
>>>  29   hdd  14.65039  1.00000  15 TiB  11 TiB  11 TiB 536 MiB  26 GiB 3.8 TiB 73.95 0.99 200     up         osd.29
>>>  30   hdd  14.65039  1.00000  15 TiB  12 TiB  11 TiB 744 MiB  28 GiB 3.1 TiB 79.07 1.06 207     up         osd.30
>>>  -7        58.89636        -  59 TiB  44 TiB  44 TiB  14 GiB 122 GiB  14 TiB 75.41 1.01   -            host s3db6
>>>  32   hdd   3.73630  1.00000 3.7 TiB 3.1 TiB 3.0 TiB  16 MiB 8.2 GiB 622 GiB 83.74 1.12  65     up         osd.32
>>>  33   hdd   3.73630  0.79999 3.7 TiB 3.0 TiB 2.9 TiB  14 MiB 8.1 GiB 740 GiB 80.67 1.08  52     up         osd.33
>>>  34   hdd   3.73630  0.79999 3.7 TiB 2.9 TiB 2.8 TiB 449 MiB 7.7 GiB 877 GiB 77.08 1.03  52     up         osd.34
>>>  35   hdd   3.73630  0.79999 3.7 TiB 2.3 TiB 2.2 TiB 133 MiB 7.0 GiB 1.4 TiB 62.18 0.83  42     up         osd.35
>>>  36   hdd  14.65039  1.00000  15 TiB  11 TiB  11 TiB 544 MiB  26 GiB 4.0 TiB 72.98 0.98 220     up         osd.36
>>>  37   hdd  14.65039  1.00000  15 TiB  11 TiB  11 TiB  11 GiB  38 GiB 3.6 TiB 75.30 1.01 200     up         osd.37
>>>  38   hdd  14.65039  1.00000  15 TiB  11 TiB  11 TiB 1.2 GiB  28 GiB 3.3 TiB 77.43 1.04 217     up         osd.38
>>>  -8        58.89636        -  59 TiB  47 TiB  46 TiB 3.9 GiB 116 GiB  12 TiB 78.98 1.06   -            host s3db7
>>>  39   hdd   3.73630  1.00000 3.7 TiB 3.2 TiB 3.2 TiB  19 MiB 8.5 GiB 499 GiB 86.96 1.17  43     up         osd.39
>>>  40   hdd   3.73630  1.00000 3.7 TiB 2.6 TiB 2.5 TiB 144 MiB 7.0 GiB 1.2 TiB 68.33 0.92  39     up         osd.40
>>>  41   hdd   3.73630  1.00000 3.7 TiB 3.0 TiB 2.9 TiB 218 MiB 7.9 GiB 732 GiB 80.86 1.08  64     up         osd.41
>>>  42   hdd   3.73630  1.00000 3.7 TiB 2.5 TiB 2.4 TiB 594 MiB 7.0 GiB 1.2 TiB 67.97 0.91  50     up         osd.42
>>>  43   hdd  14.65039  1.00000  15 TiB  12 TiB  12 TiB 564 MiB  28 GiB 2.9 TiB 80.32 1.08 213     up         osd.43
>>>  44   hdd  14.65039  1.00000  15 TiB  12 TiB  11 TiB 1.3 GiB  28 GiB 3.1 TiB 78.59 1.05 198     up         osd.44
>>>  45   hdd  14.65039  1.00000  15 TiB  12 TiB  12 TiB 1.2 GiB  30 GiB 2.8 TiB 81.05 1.09 214     up         osd.45
>>>  -9        51.28331        -  51 TiB  41 TiB  41 TiB 4.9 GiB 108 GiB  10 TiB 79.75 1.07   -            host s3db8
>>>   8   hdd   7.32619  1.00000 7.3 TiB 5.8 TiB 5.8 TiB 472 MiB  15 GiB 1.5 TiB 79.68 1.07  99     up         osd.8
>>>  16   hdd   7.32619  1.00000 7.3 TiB 5.9 TiB 5.8 TiB 785 MiB  15 GiB 1.4 TiB 80.25 1.08  97     up         osd.16
>>>  31   hdd   7.32619  1.00000 7.3 TiB 5.5 TiB 5.5 TiB 438 MiB  14 GiB 1.8 TiB 75.36 1.01  87     up         osd.31
>>>  52   hdd   7.32619  1.00000 7.3 TiB 5.7 TiB 5.7 TiB 844 MiB  15 GiB 1.6 TiB 78.19 1.05 113     up         osd.52
>>>  53   hdd   7.32619  1.00000 7.3 TiB 6.2 TiB 6.1 TiB 792 MiB  18 GiB 1.1 TiB 84.46 1.13 109     up         osd.53
>>>  54   hdd   7.32619  1.00000 7.3 TiB 5.6 TiB 5.6 TiB 959 MiB  15 GiB 1.7 TiB 76.73 1.03 115     up         osd.54
>>>  55   hdd   7.32619  1.00000 7.3 TiB 6.1 TiB 6.1 TiB 699 MiB  16 GiB 1.2 TiB 83.56 1.12 122     up         osd.55
>>> -10        51.28331        -  51 TiB  39 TiB  39 TiB 4.7 GiB 100 GiB  12 TiB 76.05 1.02   -            host s3db9
>>>  56   hdd   7.32619  1.00000 7.3 TiB 5.2 TiB 5.2 TiB 840 MiB  13 GiB 2.1 TiB 71.06 0.95 105     up         osd.56
>>>  57   hdd   7.32619  1.00000 7.3 TiB 6.1 TiB 6.0 TiB 1.0 GiB  16 GiB 1.2 TiB 83.17 1.12 102     up         osd.57
>>>  58   hdd   7.32619  1.00000 7.3 TiB 6.0 TiB 5.9 TiB  43 MiB  15 GiB 1.4 TiB 81.56 1.09 105     up         osd.58
>>>  59   hdd   7.32619  1.00000 7.3 TiB 5.9 TiB 5.9 TiB 429 MiB  15 GiB 1.4 TiB 80.64 1.08  94     up         osd.59
>>>  60   hdd   7.32619  1.00000 7.3 TiB 5.4 TiB 5.3 TiB 226 MiB  14 GiB 2.0 TiB 73.25 0.98 101     up         osd.60
>>>  61   hdd   7.32619  1.00000 7.3 TiB 4.8 TiB 4.8 TiB 1.1 GiB  12 GiB 2.5 TiB 65.84 0.88 103     up         osd.61
>>>  62   hdd   7.32619  1.00000 7.3 TiB 5.6 TiB 5.6 TiB 1.0 GiB  15 GiB 1.7 TiB 76.83 1.03 126     up         osd.62
>>>                        TOTAL 674 TiB 501 TiB 473 TiB  96 GiB 1.2 TiB 173 TiB 74.57
>>> MIN/MAX VAR: 0.17/1.20  STDDEV: 10.25
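A quick way to find the OSDs furthest from the mean utilization in output like the above is to feed `ceph osd df --format json` into a small script. This is a sketch: it assumes the JSON carries a `nodes` list with per-OSD `utilization` and `reweight` fields, which may vary slightly between releases.

```python
def worst_osds(osd_df_json, n=3):
    """Return the n OSDs whose utilization deviates most from the mean.
    Expects the structure of `ceph osd df --format json` (field names
    assumed; check your release). OSDs weighted out (reweight 0) are
    skipped since their utilization is meaningless."""
    osds = [o for o in osd_df_json["nodes"] if o.get("reweight", 0) > 0]
    mean = sum(o["utilization"] for o in osds) / len(osds)
    return sorted(osds, key=lambda o: abs(o["utilization"] - mean),
                  reverse=True)[:n]

# Tiny hand-made sample mirroring a few of the numbers above:
sample = {"nodes": [
    {"name": "osd.23", "utilization": 12.50, "reweight": 1.0},
    {"name": "osd.4",  "utilization": 89.23, "reweight": 0.9},
    {"name": "osd.73", "utilization": 71.15, "reweight": 1.0},
]}
for o in worst_osds(sample, 2):
    print(o["name"], o["utilization"])
```

On this sample it flags osd.23 and osd.4, the same outliers visible in the table (12.50% vs 89.23% %USE).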
>>>
>>>
>>>
>>> On Sat, Mar 13, 2021 at 15:57, Dan van der Ster <
>>> dan@xxxxxxxxxxxxxx> wrote:
>>>
>>>> No, increasing num PGs won't help substantially.
>>>>
>>>> Can you share the entire output of ceph osd df tree ?
>>>>
>>>> Did you already set
>>>>
>>>>   ceph config set mgr mgr/balancer/upmap_max_deviation 1
>>>>
>>>>
>>>> ??
>>>> And I recommend debug_mgr 4/5 so you can see some basic upmap balancer
>>>> logging.
>>>>
>>>> .. Dan
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> On Sat, Mar 13, 2021, 3:49 PM Boris Behrens <bb@xxxxxxxxx> wrote:
>>>>
>>>>> Hello people,
>>>>>
>>>>> I am still struggling with the balancer
>>>>> (https://www.mail-archive.com/ceph-users@xxxxxxx/msg09124.html).
>>>>> Now I've read some more, and I suspect that I don't have enough PGs.
>>>>> Currently I have 84 OSDs and 1024 PGs for the main pool (3008 total). I
>>>>> have the autoscaler enabled, but it doesn't tell me to increase the
>>>>> PGs.
>>>>>
>>>>> What do you think?
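As a sanity check on those numbers, the usual rule of thumb is that the average PG count per OSD is total PGs times the replica count divided by the number of OSDs, with roughly 100 as the target. Assuming 3x replication (not stated in the thread), the cluster already sits near that target:

```python
def avg_pgs_per_osd(total_pgs, replicas, num_osds):
    """Rough rule of thumb: each PG is stored on `replicas` OSDs, so the
    average PG count per OSD is total_pgs * replicas / num_osds."""
    return total_pgs * replicas / num_osds

# 3008 PGs total across 84 OSDs, assuming replica count 3:
print(round(avg_pgs_per_osd(3008, 3, 84)))  # ~107
```

Around 107 PGs per OSD on average is consistent with the autoscaler staying quiet, which supports the later advice in this thread that increasing pg_num is not the fix.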
>>>>>
>>>>> --
>>>>> The self-help group "UTF-8 problems" will, as an exception, meet in the
>>>>> large hall this time.
>>>>> _______________________________________________
>>>>> ceph-users mailing list -- ceph-users@xxxxxxx
>>>>> To unsubscribe send an email to ceph-users-leave@xxxxxxx
>>>>>
>>>>
>>>
>>>
>>
>
>



