Hi,

unfortunately this did not solve my problem. I still have some OSDs that fill up to 85%.

According to the logs, the autoscaler might want to add more PGs to one bucket and reduce almost all other buckets to 32.

2021-03-15 12:19:58.825 7f307f601700  4 mgr[pg_autoscaler] Pool 'eu-central-1.rgw.buckets.data' root_id -1 using 0.705080476146 of space, bias 1.0, pg target 1974.22533321 quantized to 2048 (current 1024)

Why the balancing does not happen is still unclear to me.

On Sat, 13 Mar 2021 at 16:37, Dan van der Ster <dan@xxxxxxxxxxxxxx> wrote:

> OK
> Btw, you might need to fail over to a new mgr... I'm not sure if the current
> active will read that new config.
>
> .. dan
>
>
> On Sat, Mar 13, 2021, 4:36 PM Boris Behrens <bb@xxxxxxxxx> wrote:
>
>> Hi,
>>
>> ok thanks. I just changed the value and reweighted everything back to 1.
>> Now I will let it sync over the weekend and check how it looks on Monday.
>> We tried to keep the systems' total storage as balanced as possible. New
>> systems will come with 8TB disks, but for the existing ones we added 16TB
>> disks to offset the 4TB disks, and we needed a lot of storage fast because
>> of a DC move. If you have any recommendations I would be happy to hear them.
>>
>> Cheers
>> Boris
>>
>> On Sat, 13 Mar 2021 at 16:20, Dan van der Ster <dan@xxxxxxxxxxxxxx> wrote:
>>
>>> Thanks.
>>>
>>> Decreasing the max deviation to 2 or 1 should help in your case. This
>>> option controls when the balancer stops trying to move PGs around -- by
>>> default it stops when the deviation from the mean is 5. Yes this is too
>>> large IMO -- all of our clusters have this set to 1.
>>>
>>> And given that you have some OSDs with more than 200 PGs, you definitely
>>> shouldn't increase the num PGs.
>>>
>>> But anyway with your mixed device sizes it might be challenging to make
>>> a perfectly uniform distribution. Give it a try with 1 though, and let us
>>> know how it goes.
>>>
>>> ..
>>> Dan
>>>
>>> On Sat, Mar 13, 2021, 4:11 PM Boris Behrens <bb@xxxxxxxxx> wrote:
>>>
>>>> Hi Dan,
>>>>
>>>> upmap_max_deviation is at its default (5) in our cluster. Is 1 the
>>>> recommended deviation?
>>>>
>>>> I added the whole ceph osd df tree output. (I need to remove some OSDs and
>>>> re-add them as bluestore with SSD, so 69, 73 and 82 are a bit off now. I
>>>> also reweighted to try to mitigate the %USE.)
>>>>
>>>> I will increase the mgr debugging to see what the problem is.
>>>>
>>>> [root@s3db1 ~]# ceph osd df tree
>>>>  ID CLASS    WEIGHT REWEIGHT    SIZE RAW USE    DATA    OMAP    META   AVAIL  %USE  VAR PGS STATUS TYPE NAME
>>>>  -1       673.54224        - 659 TiB 491 TiB 464 TiB  96 GiB 1.2 TiB 168 TiB 74.57 1.00   -        root default
>>>>  -2        58.30331        -  44 TiB  22 TiB  17 TiB 5.7 GiB  38 GiB  22 TiB 49.82 0.67   -            host s3db1
>>>>  23 hdd    14.65039  1.00000  15 TiB 1.8 TiB 1.7 TiB 156 MiB 4.4 GiB  13 TiB 12.50 0.17 101 up             osd.23
>>>>  69 hdd    14.55269        0     0 B     0 B     0 B     0 B     0 B     0 B     0    0  11 up             osd.69
>>>>  73 hdd    14.55269  1.00000  15 TiB  10 TiB  10 TiB 6.1 MiB  33 GiB 4.2 TiB 71.15 0.95 107 up             osd.73
>>>>  79 hdd     3.63689  1.00000 3.6 TiB 2.9 TiB 747 GiB 2.0 GiB     0 B 747 GiB 79.94 1.07  52 up             osd.79
>>>>  80 hdd     3.63689  1.00000 3.6 TiB 2.6 TiB 1.0 TiB 1.9 GiB     0 B 1.0 TiB 71.61 0.96  58 up             osd.80
>>>>  81 hdd     3.63689  1.00000 3.6 TiB 2.2 TiB 1.5 TiB 1.1 GiB     0 B 1.5 TiB 60.07 0.81  55 up             osd.81
>>>>  82 hdd     3.63689  1.00000 3.6 TiB 1.9 TiB 1.7 TiB 536 MiB     0 B 1.7 TiB 52.68 0.71  30 up             osd.82
>>>> -11        50.94173        -  51 TiB  38 TiB  38 TiB 3.7 GiB 100 GiB  13 TiB 74.69 1.00   -            host s3db10
>>>>  63 hdd     7.27739  1.00000 7.3 TiB 5.5 TiB 5.5 TiB 616 MiB  14 GiB 1.7 TiB 76.04 1.02  92 up             osd.63
>>>>  64 hdd     7.27739  1.00000 7.3 TiB 5.5 TiB 5.5 TiB 820 MiB  15 GiB 1.8 TiB 75.54 1.01 101 up             osd.64
>>>>  65 hdd     7.27739  1.00000 7.3 TiB 5.3 TiB 5.3 TiB 109 MiB  14 GiB 2.0 TiB 73.17 0.98 105 up             osd.65
>>>>  66 hdd     7.27739  1.00000 7.3 TiB 5.8 TiB 5.8 TiB 423 MiB  15 GiB 1.4 TiB 80.38 1.08  98 up             osd.66
>>>>  67 hdd     7.27739  1.00000 7.3 TiB 5.1 TiB 5.1 TiB 572 MiB  14 GiB 2.2 TiB 70.10 0.94 100 up             osd.67
>>>>  68 hdd     7.27739  1.00000 7.3 TiB 5.3 TiB 5.3 TiB 630 MiB  13 GiB 2.0 TiB 72.88 0.98 107 up             osd.68
>>>>  70 hdd     7.27739  1.00000 7.3 TiB 5.4 TiB 5.4 TiB 648 MiB  14 GiB 1.8 TiB 74.73 1.00 102 up             osd.70
>>>> -12        50.99052        -  51 TiB  39 TiB  39 TiB 2.9 GiB  99 GiB  12 TiB 77.24 1.04   -            host s3db11
>>>>  46 hdd     7.27739  1.00000 7.3 TiB 5.7 TiB 5.7 TiB 102 MiB  15 GiB 1.5 TiB 78.91 1.06  97 up             osd.46
>>>>  47 hdd     7.27739  1.00000 7.3 TiB 5.2 TiB 5.2 TiB  61 MiB  13 GiB 2.1 TiB 71.47 0.96  96 up             osd.47
>>>>  48 hdd     7.27739  1.00000 7.3 TiB 6.1 TiB 6.1 TiB 853 MiB  15 GiB 1.2 TiB 83.46 1.12 109 up             osd.48
>>>>  49 hdd     7.27739  1.00000 7.3 TiB 5.7 TiB 5.7 TiB 708 MiB  15 GiB 1.5 TiB 78.96 1.06  98 up             osd.49
>>>>  50 hdd     7.27739  1.00000 7.3 TiB 5.9 TiB 5.8 TiB 472 MiB  15 GiB 1.4 TiB 80.40 1.08 102 up             osd.50
>>>>  51 hdd     7.27739  1.00000 7.3 TiB 5.9 TiB 5.9 TiB 729 MiB  15 GiB 1.3 TiB 81.70 1.10 110 up             osd.51
>>>>  72 hdd     7.32619  1.00000 7.3 TiB 4.8 TiB 4.8 TiB  91 MiB  12 GiB 2.5 TiB 65.82 0.88  89 up             osd.72
>>>> -37        58.55478        -  59 TiB  46 TiB  46 TiB 5.0 GiB 124 GiB  12 TiB 79.04 1.06   -            host s3db12
>>>>  19 hdd     3.68750  1.00000 3.7 TiB 3.1 TiB 3.1 TiB 462 MiB 8.2 GiB 559 GiB 85.18 1.14  55 up             osd.19
>>>>  71 hdd     3.68750  1.00000 3.7 TiB 2.9 TiB 2.8 TiB 3.9 MiB 7.8 GiB 825 GiB 78.14 1.05  50 up             osd.71
>>>>  75 hdd     3.68750  1.00000 3.7 TiB 3.1 TiB 3.1 TiB 576 MiB 8.3 GiB 555 GiB 85.29 1.14  57 up             osd.75
>>>>  76 hdd     3.68750  1.00000 3.7 TiB 3.2 TiB 3.1 TiB 239 MiB 9.3 GiB 501 GiB 86.73 1.16  50 up             osd.76
>>>>  77 hdd    14.60159  1.00000  15 TiB  11 TiB  11 TiB 880 MiB  30 GiB 3.6 TiB 75.57 1.01 202 up             osd.77
>>>>  78 hdd    14.60159  1.00000  15 TiB  11 TiB  11 TiB 1.0 GiB  30 GiB 3.4 TiB 76.65 1.03 196 up             osd.78
>>>>  83 hdd    14.60159  1.00000  15 TiB  12 TiB  12 TiB 1.8 GiB  31 GiB 2.9 TiB 80.04 1.07 223 up             osd.83
>>>>  -3        58.49872        -  58 TiB  43 TiB  38 TiB 8.1 GiB  91 GiB  16 TiB 73.15 0.98   -            host s3db2
>>>>   1 hdd    14.65039  1.00000  15 TiB  11 TiB  11 TiB 3.1 GiB  38 GiB 3.6 TiB 75.52 1.01 194 up             osd.1
>>>>   3 hdd     3.63689  1.00000 3.6 TiB 2.2 TiB 1.4 TiB 418 MiB     0 B 1.4 TiB 60.94 0.82  52 up             osd.3
>>>>   4 hdd     3.63689  0.89999 3.6 TiB 3.2 TiB 401 GiB 845 MiB     0 B 401 GiB 89.23 1.20  53 up             osd.4
>>>>   5 hdd     3.63689  1.00000 3.6 TiB 2.3 TiB 1.3 TiB 437 MiB     0 B 1.3 TiB 62.88 0.84  51 up             osd.5
>>>>   6 hdd     3.63689  1.00000 3.6 TiB 2.0 TiB 1.7 TiB 1.8 GiB     0 B 1.7 TiB 54.51 0.73  47 up             osd.6
>>>>   7 hdd    14.65039  1.00000  15 TiB  11 TiB  11 TiB 493 MiB  26 GiB 3.8 TiB 73.90 0.99 185 up             osd.7
>>>>  74 hdd    14.65039  1.00000  15 TiB  11 TiB  11 TiB 1.1 GiB  27 GiB 3.5 TiB 76.27 1.02 208 up             osd.74
>>>>  -4        58.49872        -  58 TiB  43 TiB  37 TiB  33 GiB  86 GiB  15 TiB 74.05 0.99   -            host s3db3
>>>>   2 hdd    14.65039  1.00000  15 TiB  11 TiB  11 TiB 850 MiB  26 GiB 4.0 TiB 72.78 0.98 203 up             osd.2
>>>>   9 hdd    14.65039  1.00000  15 TiB  11 TiB  11 TiB 8.3 GiB  33 GiB 3.6 TiB 75.62 1.01 189 up             osd.9
>>>>  10 hdd    14.65039  1.00000  15 TiB  11 TiB  11 TiB 663 MiB  28 GiB 3.5 TiB 76.34 1.02 211 up             osd.10
>>>>  12 hdd     3.63689  1.00000 3.6 TiB 2.4 TiB 1.2 TiB 633 MiB     0 B 1.2 TiB 66.22 0.89  44 up             osd.12
>>>>  13 hdd     3.63689  1.00000 3.6 TiB 2.9 TiB 720 GiB 2.3 GiB     0 B 720 GiB 80.66 1.08  66 up             osd.13
>>>>  14 hdd     3.63689  1.00000 3.6 TiB 3.1 TiB 552 GiB  18 GiB     0 B 552 GiB 85.18 1.14  60 up             osd.14
>>>>  15 hdd     3.63689  1.00000 3.6 TiB 2.0 TiB 1.7 TiB 2.1 GiB     0 B 1.7 TiB 53.72 0.72  44 up             osd.15
>>>>  -5        58.49872        -  58 TiB  45 TiB  37 TiB 7.2 GiB  99 GiB  14 TiB 76.37 1.02   -            host s3db4
>>>>  11 hdd    14.65039  1.00000  15 TiB  12 TiB  12 TiB 897 MiB  28 GiB 2.8 TiB 81.15 1.09 205 up             osd.11
>>>>  17 hdd    14.65039  1.00000  15 TiB  11 TiB  11 TiB 1.2 GiB  27 GiB 3.6 TiB 75.38 1.01 211 up             osd.17
>>>>  18 hdd    14.65039  1.00000  15 TiB  11 TiB  11 TiB 965 MiB  44 GiB 4.0 TiB 72.86 0.98 188 up             osd.18
>>>>  20 hdd     3.63689  1.00000 3.6 TiB 2.9 TiB 796 GiB 529 MiB     0 B 796 GiB 78.63 1.05  66 up             osd.20
>>>>  21 hdd     3.63689  1.00000 3.6 TiB 2.6 TiB 1.1 TiB 2.1 GiB     0 B 1.1 TiB 70.32 0.94  47 up             osd.21
>>>>  22 hdd     3.63689  1.00000 3.6 TiB 2.9 TiB 802 GiB 882 MiB     0 B 802 GiB 78.47 1.05  58 up             osd.22
>>>>  24 hdd     3.63689  1.00000 3.6 TiB 2.8 TiB 856 GiB 645 MiB     0 B 856 GiB 77.01 1.03  47 up             osd.24
>>>>  -6        58.89636        -  59 TiB  44 TiB  44 TiB 2.4 GiB 111 GiB  15 TiB 75.22 1.01   -            host s3db5
>>>>   0 hdd     3.73630  1.00000 3.7 TiB 2.4 TiB 2.3 TiB  70 MiB 6.6 GiB 1.3 TiB 65.00 0.87  48 up             osd.0
>>>>  25 hdd     3.73630  1.00000 3.7 TiB 2.4 TiB 2.3 TiB 5.3 MiB 6.6 GiB 1.4 TiB 63.86 0.86  41 up             osd.25
>>>>  26 hdd     3.73630  1.00000 3.7 TiB 2.9 TiB 2.8 TiB 181 MiB 7.6 GiB 862 GiB 77.47 1.04  48 up             osd.26
>>>>  27 hdd     3.73630  1.00000 3.7 TiB 2.3 TiB 2.2 TiB 7.0 MiB 6.1 GiB 1.5 TiB 61.00 0.82  48 up             osd.27
>>>>  28 hdd    14.65039  1.00000  15 TiB  12 TiB  12 TiB 937 MiB  30 GiB 2.8 TiB 81.19 1.09 203 up             osd.28
>>>>  29 hdd    14.65039  1.00000  15 TiB  11 TiB  11 TiB 536 MiB  26 GiB 3.8 TiB 73.95 0.99 200 up             osd.29
>>>>  30 hdd    14.65039  1.00000  15 TiB  12 TiB  11 TiB 744 MiB  28 GiB 3.1 TiB 79.07 1.06 207 up             osd.30
>>>>  -7        58.89636        -  59 TiB  44 TiB  44 TiB  14 GiB 122 GiB  14 TiB 75.41 1.01   -            host s3db6
>>>>  32 hdd     3.73630  1.00000 3.7 TiB 3.1 TiB 3.0 TiB  16 MiB 8.2 GiB 622 GiB 83.74 1.12  65 up             osd.32
>>>>  33 hdd     3.73630  0.79999 3.7 TiB 3.0 TiB 2.9 TiB  14 MiB 8.1 GiB 740 GiB 80.67 1.08  52 up             osd.33
>>>>  34 hdd     3.73630  0.79999 3.7 TiB 2.9 TiB 2.8 TiB 449 MiB 7.7 GiB 877 GiB 77.08 1.03  52 up             osd.34
>>>>  35 hdd     3.73630  0.79999 3.7 TiB 2.3 TiB 2.2 TiB 133 MiB 7.0 GiB 1.4 TiB 62.18 0.83  42 up             osd.35
>>>>  36 hdd    14.65039  1.00000  15 TiB  11 TiB  11 TiB 544 MiB  26 GiB 4.0 TiB 72.98 0.98 220 up             osd.36
>>>>  37 hdd    14.65039  1.00000  15 TiB  11 TiB  11 TiB  11 GiB  38 GiB 3.6 TiB 75.30 1.01 200 up             osd.37
>>>>  38 hdd    14.65039  1.00000  15 TiB  11 TiB  11 TiB 1.2 GiB  28 GiB 3.3 TiB 77.43 1.04 217 up             osd.38
>>>>  -8        58.89636        -  59 TiB  47 TiB  46 TiB 3.9 GiB 116 GiB  12 TiB 78.98 1.06   -            host s3db7
>>>>  39 hdd     3.73630  1.00000 3.7 TiB 3.2 TiB 3.2 TiB  19 MiB 8.5 GiB 499 GiB 86.96 1.17  43 up             osd.39
>>>>  40 hdd     3.73630  1.00000 3.7 TiB 2.6 TiB 2.5 TiB 144 MiB 7.0 GiB 1.2 TiB 68.33 0.92  39 up             osd.40
>>>>  41 hdd     3.73630  1.00000 3.7 TiB 3.0 TiB 2.9 TiB 218 MiB 7.9 GiB 732 GiB 80.86 1.08  64 up             osd.41
>>>>  42 hdd     3.73630  1.00000 3.7 TiB 2.5 TiB 2.4 TiB 594 MiB 7.0 GiB 1.2 TiB 67.97 0.91  50 up             osd.42
>>>>  43 hdd    14.65039  1.00000  15 TiB  12 TiB  12 TiB 564 MiB  28 GiB 2.9 TiB 80.32 1.08 213 up             osd.43
>>>>  44 hdd    14.65039  1.00000  15 TiB  12 TiB  11 TiB 1.3 GiB  28 GiB 3.1 TiB 78.59 1.05 198 up             osd.44
>>>>  45 hdd    14.65039  1.00000  15 TiB  12 TiB  12 TiB 1.2 GiB  30 GiB 2.8 TiB 81.05 1.09 214 up             osd.45
>>>>  -9        51.28331        -  51 TiB  41 TiB  41 TiB 4.9 GiB 108 GiB  10 TiB 79.75 1.07   -            host s3db8
>>>>   8 hdd     7.32619  1.00000 7.3 TiB 5.8 TiB 5.8 TiB 472 MiB  15 GiB 1.5 TiB 79.68 1.07  99 up             osd.8
>>>>  16 hdd     7.32619  1.00000 7.3 TiB 5.9 TiB 5.8 TiB 785 MiB  15 GiB 1.4 TiB 80.25 1.08  97 up             osd.16
>>>>  31 hdd     7.32619  1.00000 7.3 TiB 5.5 TiB 5.5 TiB 438 MiB  14 GiB 1.8 TiB 75.36 1.01  87 up             osd.31
>>>>  52 hdd     7.32619  1.00000 7.3 TiB 5.7 TiB 5.7 TiB 844 MiB  15 GiB 1.6 TiB 78.19 1.05 113 up             osd.52
>>>>  53 hdd     7.32619  1.00000 7.3 TiB 6.2 TiB 6.1 TiB 792 MiB  18 GiB 1.1 TiB 84.46 1.13 109 up             osd.53
>>>>  54 hdd     7.32619  1.00000 7.3 TiB 5.6 TiB 5.6 TiB 959 MiB  15 GiB 1.7 TiB 76.73 1.03 115 up             osd.54
>>>>  55 hdd     7.32619  1.00000 7.3 TiB 6.1 TiB 6.1 TiB 699 MiB  16 GiB 1.2 TiB 83.56 1.12 122 up             osd.55
>>>> -10        51.28331        -  51 TiB  39 TiB  39 TiB 4.7 GiB 100 GiB  12 TiB 76.05 1.02   -            host s3db9
>>>>  56 hdd     7.32619  1.00000 7.3 TiB 5.2 TiB 5.2 TiB 840 MiB  13 GiB 2.1 TiB 71.06 0.95 105 up             osd.56
>>>>  57 hdd     7.32619  1.00000 7.3 TiB 6.1 TiB 6.0 TiB 1.0 GiB  16 GiB 1.2 TiB 83.17 1.12 102 up             osd.57
>>>>  58 hdd     7.32619  1.00000 7.3 TiB 6.0 TiB 5.9 TiB  43 MiB  15 GiB 1.4 TiB 81.56 1.09 105 up             osd.58
>>>>  59 hdd     7.32619  1.00000 7.3 TiB 5.9 TiB 5.9 TiB 429 MiB  15 GiB 1.4 TiB 80.64 1.08  94 up             osd.59
>>>>  60 hdd     7.32619  1.00000 7.3 TiB 5.4 TiB 5.3 TiB 226 MiB  14 GiB 2.0 TiB 73.25 0.98 101 up             osd.60
>>>>  61 hdd     7.32619  1.00000 7.3 TiB 4.8 TiB 4.8 TiB 1.1 GiB  12 GiB 2.5 TiB 65.84 0.88 103 up             osd.61
>>>>  62 hdd     7.32619  1.00000 7.3 TiB 5.6 TiB 5.6 TiB 1.0 GiB  15 GiB 1.7 TiB 76.83 1.03 126 up             osd.62
>>>>                        TOTAL 674 TiB 501 TiB 473 TiB  96 GiB 1.2 TiB 173 TiB 74.57
>>>> MIN/MAX VAR: 0.17/1.20  STDDEV: 10.25
>>>>
>>>> On Sat, 13 Mar 2021 at 15:57, Dan van der Ster <dan@xxxxxxxxxxxxxx> wrote:
>>>>
>>>>> No, increasing num PGs won't help substantially.
>>>>>
>>>>> Can you share the entire output of ceph osd df tree ?
>>>>>
>>>>> Did you already set
>>>>>
>>>>> ceph config set mgr mgr/balancer/upmap_max_deviation 1
>>>>>
>>>>> ??
>>>>> And I recommend debug_mgr 4/5 so you can see some basic upmap balancer
>>>>> logging.
>>>>>
>>>>> .. Dan
>>>>>
>>>>> On Sat, Mar 13, 2021, 3:49 PM Boris Behrens <bb@xxxxxxxxx> wrote:
>>>>>
>>>>>> Hello people,
>>>>>>
>>>>>> I am still struggling with the balancer
>>>>>> (https://www.mail-archive.com/ceph-users@xxxxxxx/msg09124.html)
>>>>>> Now I've read some more and think that I might not have enough PGs.
>>>>>> Currently I have 84 OSDs and 1024 PGs for the main pool (3008 total). I
>>>>>> have the autoscaler enabled, but it doesn't tell me to increase the
>>>>>> PGs.
>>>>>>
>>>>>> What do you think?
--
The "UTF-8 problems" self-help group will meet in the large ("groüen") hall this time.
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
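
As a side note, the numbers in the pg_autoscaler log line quoted at the top of this message can be reproduced with a little arithmetic. The following is only a rough sketch, not the module's actual code: the 84 OSDs and the 0.705 space ratio come from the thread, but a replica size of 3, a target of 100 PGs per OSD and a change threshold of 3.0 are assumptions, and the function names are made up for illustration.

```python
import math


def raw_pg_target(capacity_ratio, bias, osd_count, target_pg_per_osd, pool_size):
    """PG target before rounding: the pool's share of the cluster-wide PG budget."""
    return capacity_ratio * bias * osd_count * target_pg_per_osd / pool_size


def nearest_power_of_two(n):
    """Round n to the closer of the two neighbouring powers of two."""
    lower = 2 ** math.floor(math.log2(n))
    upper = lower * 2
    return lower if (n - lower) < (upper - n) else upper


def would_adjust(current, quantized, threshold=3.0):
    """Assumed rule: only act when the target is off by more than `threshold` times."""
    return quantized > current * threshold or quantized < current / threshold


# 0.705080476146 and 84 OSDs are from the thread; 100 PGs/OSD and size 3 are assumed.
target = raw_pg_target(0.705080476146, 1.0, 84, 100, 3)
quantized = nearest_power_of_two(target)
print(round(target, 2), quantized, would_adjust(1024, quantized))
# prints: 1974.23 2048 False
```

Under these assumptions the sketch lands on the same 1974.225 target and the same 2048 quantization as the log. It would also be consistent with the autoscaler staying quiet: 2048 is only twice the current 1024, which is below the assumed factor of 3.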