Hi,
Thanks for your advice. Here is the output of ceph osd df tree:
ID  CLASS WEIGHT     REWEIGHT SIZE    RAW USE DATA    OMAP    META    AVAIL   %USE  VAR  PGS STATUS TYPE NAME
-1        1018.65833        - 466 TiB 214 TiB 213 TiB 117 GiB 605 GiB 252 TiB     0    0   -        root default
-15        465.66577        - 466 TiB 214 TiB 213 TiB 117 GiB 605 GiB 252 TiB 45.88 1.06   -        room 1222-2-10
-3         116.41678        - 116 TiB  52 TiB  52 TiB  24 GiB 153 GiB  64 TiB 44.91 1.04   -        host lpnceph01
0   hdd      7.27599  1.00000 7.3 TiB 3.6 TiB 3.6 TiB 2.5 GiB  16 GiB 3.7 TiB 49.31 1.14  35     up osd.0
4   hdd      7.27599  1.00000 7.3 TiB 3.2 TiB 3.1 TiB 2.4 GiB 8.5 GiB 4.1 TiB 43.39 1.00  35     up osd.4
8   hdd      7.27699  1.00000 7.3 TiB 3.1 TiB 3.1 TiB 2.3 GiB 9.1 GiB 4.1 TiB 43.23 1.00  33     up osd.8
12  hdd      7.27599  1.00000 7.3 TiB 3.0 TiB 3.0 TiB 2.4 GiB 8.8 GiB 4.3 TiB 40.85 0.94  32     up osd.12
16  hdd      7.27599  1.00000 7.3 TiB 3.5 TiB 3.5 TiB  40 MiB 9.7 GiB 3.8 TiB 47.95 1.11  36     up osd.16
20  hdd      7.27599  1.00000 7.3 TiB 2.8 TiB 2.8 TiB 2.4 GiB 8.3 GiB 4.5 TiB 38.00 0.88  33     up osd.20
24  hdd      7.27599  1.00000 7.3 TiB 3.6 TiB 3.6 TiB 2.3 GiB  10 GiB 3.6 TiB 49.98 1.15  37     up osd.24
28  hdd      7.27599  1.00000 7.3 TiB 2.6 TiB 2.6 TiB  50 MiB 8.3 GiB 4.7 TiB 35.39 0.82  26     up osd.28
32  hdd      7.27599  1.00000 7.3 TiB 3.2 TiB 3.2 TiB  31 MiB 9.3 GiB 4.1 TiB 44.21 1.02  32     up osd.32
36  hdd      7.27599  1.00000 7.3 TiB 4.2 TiB 4.2 TiB 2.6 GiB  11 GiB 3.1 TiB 57.79 1.33  37     up osd.36
40  hdd      7.27599  1.00000 7.3 TiB 3.5 TiB 3.5 TiB 2.4 GiB 9.1 GiB 3.8 TiB 47.84 1.10  42     up osd.40
44  hdd      7.27599  1.00000 7.3 TiB 3.5 TiB 3.5 TiB 2.3 GiB 9.2 GiB 3.8 TiB 48.44 1.12  39     up osd.44
48  hdd      7.27599  1.00000 7.3 TiB 3.1 TiB 3.0 TiB  91 MiB 9.0 GiB 4.2 TiB 41.93 0.97  30     up osd.48
52  hdd      7.27599  1.00000 7.3 TiB 3.5 TiB 3.5 TiB 2.4 GiB 9.7 GiB 3.8 TiB 47.59 1.10  33     up osd.52
56  hdd      7.27599  1.00000 7.3 TiB 3.0 TiB 3.0 TiB  23 MiB 8.2 GiB 4.2 TiB 41.88 0.97  42     up osd.56
60  hdd      7.27599  1.00000 7.3 TiB 3.0 TiB 3.0 TiB  38 MiB 8.3 GiB 4.3 TiB 40.76 0.94  29     up osd.60
-5         116.41600        - 116 TiB  54 TiB  53 TiB  28 GiB 150 GiB  63 TiB 46.02 1.06   -        host lpnceph02
1   hdd      7.27599  1.00000 7.3 TiB 2.9 TiB 2.9 TiB  26 MiB 8.0 GiB 4.4 TiB 40.19 0.93  34     up osd.1
5   hdd      7.27599  1.00000 7.3 TiB 2.7 TiB 2.7 TiB  26 MiB 7.9 GiB 4.6 TiB 36.92 0.85  26     up osd.5
9   hdd      7.27599  1.00000 7.3 TiB 4.0 TiB 4.0 TiB  42 MiB  11 GiB 3.3 TiB 54.44 1.26  38     up osd.9
13  hdd      7.27599  1.00000 7.3 TiB 3.0 TiB 3.0 TiB 2.3 GiB 9.6 GiB 4.3 TiB 41.47 0.96  37     up osd.13
17  hdd      7.27599  1.00000 7.3 TiB 3.4 TiB 3.4 TiB 2.3 GiB 9.4 GiB 3.9 TiB 46.79 1.08  37     up osd.17
21  hdd      7.27599  1.00000 7.3 TiB 3.2 TiB 3.2 TiB  41 MiB 9.2 GiB 4.1 TiB 44.18 1.02  30     up osd.21
25  hdd      7.27599  1.00000 7.3 TiB 3.7 TiB 3.7 TiB 2.4 GiB  10 GiB 3.5 TiB 51.33 1.19  41     up osd.25
29  hdd      7.27599  1.00000 7.3 TiB 3.1 TiB 3.1 TiB 2.4 GiB 8.7 GiB 4.2 TiB 42.14 0.97  35     up osd.29
33  hdd      7.27599  1.00000 7.3 TiB 3.5 TiB 3.5 TiB 2.3 GiB 9.4 GiB 3.8 TiB 48.01 1.11  39     up osd.33
37  hdd      7.27599  1.00000 7.3 TiB 3.2 TiB 3.2 TiB 4.5 GiB 9.8 GiB 4.0 TiB 44.57 1.03  30     up osd.37
41  hdd      7.27599  1.00000 7.3 TiB 3.8 TiB 3.8 TiB 2.2 GiB  11 GiB 3.5 TiB 52.50 1.21  36     up osd.41
45  hdd      7.27599  1.00000 7.3 TiB 3.4 TiB 3.4 TiB 2.4 GiB 9.4 GiB 3.9 TiB 46.87 1.08  36     up osd.45
49  hdd      7.27599  1.00000 7.3 TiB 3.3 TiB 3.3 TiB 2.3 GiB 9.0 GiB 4.0 TiB 45.39 1.05  39     up osd.49
53  hdd      7.27599  1.00000 7.3 TiB 3.2 TiB 3.2 TiB  37 MiB 8.9 GiB 4.1 TiB 43.80 1.01  31     up osd.53
57  hdd      7.27599  1.00000 7.3 TiB 3.4 TiB 3.4 TiB 2.3 GiB 9.2 GiB 3.9 TiB 47.01 1.09  38     up osd.57
61  hdd      7.27599  1.00000 7.3 TiB 3.7 TiB 3.7 TiB 2.4 GiB 9.8 GiB 3.6 TiB 50.64 1.17  36     up osd.61
-9         116.41600        - 116 TiB  56 TiB  56 TiB  31 GiB 158 GiB  60 TiB 48.12 1.11   -        host lpnceph04
7   hdd      7.27599  1.00000 7.3 TiB 3.3 TiB 3.3 TiB 2.4 GiB 9.2 GiB 3.9 TiB 45.74 1.06  34     up osd.7
11  hdd      7.27599  1.00000 7.3 TiB 3.9 TiB 3.9 TiB 7.1 GiB  11 GiB 3.4 TiB 53.24 1.23  39     up osd.11
15  hdd      7.27599  1.00000 7.3 TiB 3.5 TiB 3.5 TiB  43 MiB 9.3 GiB 3.7 TiB 48.54 1.12  38     up osd.15
27  hdd      7.27599  1.00000 7.3 TiB 2.9 TiB 2.9 TiB 2.3 GiB 8.2 GiB 4.4 TiB 39.91 0.92  33     up osd.27
31  hdd      7.27599  1.00000 7.3 TiB 2.8 TiB 2.8 TiB  24 MiB 8.1 GiB 4.4 TiB 39.16 0.90  34     up osd.31
35  hdd      7.27599  1.00000 7.3 TiB 3.8 TiB 3.7 TiB 2.3 GiB  13 GiB 3.5 TiB 51.71 1.19  40     up osd.35
39  hdd      7.27599  1.00000 7.3 TiB 3.8 TiB 3.7 TiB  65 MiB  13 GiB 3.5 TiB 51.65 1.19  35     up osd.39
43  hdd      7.27599  1.00000 7.3 TiB 3.3 TiB 3.3 TiB 2.4 GiB 9.4 GiB 4.0 TiB 45.69 1.06  35     up osd.43
47  hdd      7.27599  1.00000 7.3 TiB 3.9 TiB 3.8 TiB 4.7 GiB  10 GiB 3.4 TiB 52.99 1.22  44     up osd.47
51  hdd      7.27599  1.00000 7.3 TiB 3.9 TiB 3.9 TiB  41 MiB  10 GiB 3.4 TiB 53.75 1.24  40     up osd.51
55  hdd      7.27599  1.00000 7.3 TiB 3.2 TiB 3.2 TiB 7.1 GiB 9.2 GiB 4.0 TiB 44.37 1.02  36     up osd.55
59  hdd      7.27599  1.00000 7.3 TiB 3.7 TiB 3.7 TiB  43 MiB 9.9 GiB 3.5 TiB 51.46 1.19  38     up osd.59
100 hdd      7.27599  1.00000 7.3 TiB 3.6 TiB 3.6 TiB 2.3 GiB  10 GiB 3.7 TiB 49.65 1.15  36     up osd.100
101 hdd      7.27599  1.00000 7.3 TiB 3.3 TiB 3.3 TiB  46 MiB 9.0 GiB 4.0 TiB 45.14 1.04  38     up osd.101
102 hdd      7.27599  1.00000 7.3 TiB 3.4 TiB 3.4 TiB  75 MiB 9.0 GiB 3.9 TiB 46.45 1.07  37     up osd.102
105 hdd      7.27599  1.00000 7.3 TiB 3.7 TiB 3.7 TiB  58 MiB 9.7 GiB 3.6 TiB 50.50 1.17  40     up osd.105
-13        116.41699        - 116 TiB  52 TiB  52 TiB  33 GiB 144 GiB  65 TiB 44.49 1.03   -        host lpnceph06
19  hdd      7.27699  1.00000 7.3 TiB 3.2 TiB 3.2 TiB  25 MiB 8.5 GiB 4.1 TiB 43.47 1.00  40     up osd.19
72  hdd      7.27599  1.00000 7.3 TiB 3.5 TiB 3.5 TiB 2.6 GiB 9.0 GiB 3.8 TiB 47.61 1.10  34     up osd.72
74  hdd      7.27599  1.00000 7.3 TiB 3.0 TiB 3.0 TiB  51 MiB 8.7 GiB 4.3 TiB 40.92 0.95  28     up osd.74
75  hdd      7.27599  1.00000 7.3 TiB 2.8 TiB 2.8 TiB 2.4 GiB 7.8 GiB 4.5 TiB 38.48 0.89  35     up osd.75
76  hdd      7.27599  1.00000 7.3 TiB 3.0 TiB 3.0 TiB 4.7 GiB 8.7 GiB 4.3 TiB 40.87 0.94  33     up osd.76
77  hdd      7.27599  1.00000 7.3 TiB 3.2 TiB 3.2 TiB 2.5 GiB 8.7 GiB 4.1 TiB 43.70 1.01  38     up osd.77
78  hdd      7.27599  1.00000 7.3 TiB 3.4 TiB 3.4 TiB 4.7 GiB  12 GiB 3.9 TiB 46.52 1.07  36     up osd.78
79  hdd      7.27599  1.00000 7.3 TiB 2.8 TiB 2.8 TiB 2.4 GiB 8.0 GiB 4.5 TiB 38.58 0.89  35     up osd.79
80  hdd      7.27599  1.00000 7.3 TiB 3.4 TiB 3.4 TiB 2.4 GiB 8.9 GiB 3.9 TiB 46.74 1.08  38     up osd.80
81  hdd      7.27599  1.00000 7.3 TiB 3.8 TiB 3.7 TiB  36 MiB 9.1 GiB 3.5 TiB 51.61 1.19  39     up osd.81
82  hdd      7.27599  1.00000 7.3 TiB 3.5 TiB 3.5 TiB  30 MiB 9.1 GiB 3.8 TiB 48.03 1.11  39     up osd.82
83  hdd      7.27599  1.00000 7.3 TiB 2.9 TiB 2.9 TiB  39 MiB 8.3 GiB 4.3 TiB 40.34 0.93  33     up osd.83
84  hdd      7.27599  1.00000 7.3 TiB 3.8 TiB 3.8 TiB 4.6 GiB  10 GiB 3.5 TiB 52.00 1.20  42     up osd.84
85  hdd      7.27599  1.00000 7.3 TiB 3.0 TiB 3.0 TiB 2.4 GiB 8.7 GiB 4.3 TiB 41.50 0.96  32     up osd.85
86  hdd      7.27599  1.00000 7.3 TiB 2.8 TiB 2.8 TiB  49 MiB 8.3 GiB 4.4 TiB 39.13 0.90  28     up osd.86
87  hdd      7.27599  1.00000 7.3 TiB 3.8 TiB 3.8 TiB 4.6 GiB  10 GiB 3.5 TiB 52.37 1.21  40     up osd.87
-16        552.99255        - 349 TiB 122 TiB 122 TiB  56 GiB 304 GiB 227 TiB     0    0   -        room 1222-SS-09
-21                0        -     0 B     0 B     0 B     0 B     0 B     0 B     0    0   -        host lpnceph00
-7         116.41600        - 116 TiB  61 TiB  60 TiB  33 GiB 171 GiB  56 TiB 52.08 1.20   -        host lpnceph03
2   hdd      7.27599  1.00000 7.3 TiB 2.9 TiB 2.9 TiB  34 MiB 8.2 GiB 4.4 TiB 40.02 0.92  31     up osd.2
6   hdd      7.27599  1.00000 7.3 TiB 3.5 TiB 3.5 TiB 4.8 GiB  16 GiB 3.8 TiB 48.06 1.11  39     up osd.6
10  hdd      7.27599  1.00000 7.3 TiB 4.3 TiB 4.2 TiB 2.4 GiB  13 GiB 3.0 TiB 58.60 1.35  46     up osd.10
14  hdd      7.27599  1.00000 7.3 TiB 4.1 TiB 4.0 TiB  51 MiB  12 GiB 3.2 TiB 55.79 1.29  40     up osd.14
18  hdd      7.27599  1.00000 7.3 TiB 4.2 TiB 4.1 TiB 4.8 GiB  11 GiB 3.1 TiB 57.11 1.32  43     up osd.18
22  hdd      7.27599  1.00000 7.3 TiB 4.5 TiB 4.5 TiB  41 MiB  12 GiB 2.8 TiB 61.92 1.43  46     up osd.22
26  hdd      7.27599  1.00000 7.3 TiB 4.0 TiB 4.0 TiB 2.3 GiB  11 GiB 3.3 TiB 54.48 1.26  40     up osd.26
30  hdd      7.27599  1.00000 7.3 TiB 4.1 TiB 4.1 TiB  54 MiB  11 GiB 3.2 TiB 55.81 1.29  39     up osd.30
34  hdd      7.27599  1.00000 7.3 TiB 4.4 TiB 4.4 TiB  59 MiB  11 GiB 2.9 TiB 60.79 1.40  44     up osd.34
38  hdd      7.27599  1.00000 7.3 TiB 3.3 TiB 3.3 TiB  37 MiB 8.8 GiB 4.0 TiB 45.71 1.06  37     up osd.38
42  hdd      7.27599  1.00000 7.3 TiB 4.1 TiB 4.1 TiB 4.7 GiB  11 GiB 3.2 TiB 55.92 1.29  47     up osd.42
46  hdd      7.27599  1.00000 7.3 TiB 3.1 TiB 3.1 TiB 4.7 GiB 8.5 GiB 4.1 TiB 43.08 0.99  37     up osd.46
50  hdd      7.27599  1.00000 7.3 TiB 3.6 TiB 3.6 TiB 4.7 GiB 9.7 GiB 3.7 TiB 49.12 1.13  44     up osd.50
54  hdd      7.27599  1.00000 7.3 TiB 3.4 TiB 3.4 TiB  54 MiB 9.2 GiB 3.9 TiB 46.24 1.07  35     up osd.54
58  hdd      7.27599  1.00000 7.3 TiB 3.8 TiB 3.8 TiB 2.3 GiB 9.9 GiB 3.5 TiB 52.11 1.20  40     up osd.58
62  hdd      7.27599  1.00000 7.3 TiB 3.5 TiB 3.5 TiB 2.4 GiB 9.6 GiB 3.7 TiB 48.53 1.12  42     up osd.62
-11                0        -     0 B     0 B     0 B     0 B     0 B     0 B     0    0   -        host lpnceph05
-19         87.31200        -  87 TiB  45 TiB  45 TiB  26 GiB 122 GiB  42 TiB 51.40 1.19   -        host lpnceph07
89  hdd      7.27599  1.00000 7.3 TiB 3.4 TiB 3.4 TiB 4.7 GiB 9.6 GiB 3.8 TiB 47.31 1.09  38     up osd.89
90  hdd      7.27599  1.00000 7.3 TiB 3.5 TiB 3.5 TiB 2.4 GiB 9.6 GiB 3.8 TiB 48.11 1.11  39     up osd.90
91  hdd      7.27599  1.00000 7.3 TiB 4.4 TiB 4.4 TiB  37 MiB  11 GiB 2.9 TiB 60.42 1.40  48     up osd.91
92  hdd      7.27599  1.00000 7.3 TiB 3.6 TiB 3.6 TiB 7.2 GiB  10 GiB 3.7 TiB 49.36 1.14  39     up osd.92
93  hdd      7.27599  1.00000 7.3 TiB 3.5 TiB 3.5 TiB  43 MiB 8.9 GiB 3.8 TiB 47.75 1.10  38     up osd.93
94  hdd      7.27599  1.00000 7.3 TiB 3.4 TiB 3.4 TiB  35 MiB 9.3 GiB 3.9 TiB 46.55 1.08  34     up osd.94
95  hdd      7.27599  1.00000 7.3 TiB 4.1 TiB 4.1 TiB  33 MiB  12 GiB 3.2 TiB 56.30 1.30  43     up osd.95
97  hdd      7.27599  1.00000 7.3 TiB 3.6 TiB 3.6 TiB  35 MiB 9.6 GiB 3.7 TiB 48.99 1.13  39     up osd.97
98  hdd      7.27599  1.00000 7.3 TiB 3.0 TiB 2.9 TiB 4.6 GiB 8.5 GiB 4.3 TiB 40.63 0.94  34     up osd.98
99  hdd      7.27599  1.00000 7.3 TiB 3.9 TiB 3.8 TiB 4.6 GiB  10 GiB 3.4 TiB 52.98 1.22  43     up osd.99
103 hdd      7.27599  1.00000 7.3 TiB 3.6 TiB 3.6 TiB 2.3 GiB 9.5 GiB 3.7 TiB 49.16 1.14  41     up osd.103
104 hdd      7.27599  1.00000 7.3 TiB 5.0 TiB 5.0 TiB  55 MiB  13 GiB 2.2 TiB 69.22 1.60  52     up osd.104
-23        349.26453        - 349 TiB 122 TiB 122 TiB  56 GiB 304 GiB 227 TiB 34.89 0.81   -        host lpnceph09
3   hdd     14.55269  1.00000  15 TiB 5.4 TiB 5.4 TiB 4.6 GiB  13 GiB 9.2 TiB 37.03 0.86  61     up osd.3
23  hdd     14.55269  1.00000  15 TiB 4.8 TiB 4.8 TiB  41 MiB  12 GiB 9.7 TiB 33.28 0.77  50     up osd.23
63  hdd     14.55269  1.00000  15 TiB 5.1 TiB 5.1 TiB 2.4 GiB  13 GiB 9.5 TiB 34.85 0.80  48     up osd.63
64  hdd     14.55269  1.00000  15 TiB 5.2 TiB 5.2 TiB 6.8 GiB  13 GiB 9.3 TiB 35.79 0.83  58     up osd.64
65  hdd     14.55269  1.00000  15 TiB 6.2 TiB 6.2 TiB 4.6 GiB  15 GiB 8.3 TiB 42.62 0.98  62     up osd.65
66  hdd     14.55269  1.00000  15 TiB 4.8 TiB 4.8 TiB 2.4 GiB  12 GiB 9.7 TiB 33.09 0.76  52     up osd.66
67  hdd     14.55269  1.00000  15 TiB 5.1 TiB 5.1 TiB  41 MiB  13 GiB 9.5 TiB 35.04 0.81  54     up osd.67
68  hdd     14.55269  1.00000  15 TiB 5.3 TiB 5.3 TiB  57 MiB  13 GiB 9.2 TiB 36.68 0.85  55     up osd.68
69  hdd     14.55269  1.00000  15 TiB 5.9 TiB 5.8 TiB  53 MiB  15 GiB 8.7 TiB 40.27 0.93  56     up osd.69
70  hdd     14.55269  1.00000  15 TiB 4.4 TiB 4.4 TiB 4.7 GiB  12 GiB  10 TiB 30.51 0.70  50     up osd.70
71  hdd     14.55269  1.00000  15 TiB 5.0 TiB 5.0 TiB  46 MiB  12 GiB 9.6 TiB 34.11 0.79  55     up osd.71
73  hdd     14.55269  1.00000  15 TiB 4.9 TiB 4.9 TiB 2.3 GiB  12 GiB 9.7 TiB 33.49 0.77  54     up osd.73
88  hdd     14.55269  1.00000  15 TiB 5.0 TiB 5.0 TiB  40 MiB  12 GiB 9.5 TiB 34.42 0.80  50     up osd.88
96  hdd     14.55269  1.00000  15 TiB 5.3 TiB 5.2 TiB 2.3 GiB  13 GiB 9.3 TiB 36.17 0.84  54     up osd.96
106 hdd     14.55269  1.00000  15 TiB 4.8 TiB 4.8 TiB 2.5 GiB  12 GiB 9.7 TiB 33.17 0.77  54     up osd.106
107 hdd     14.55269  1.00000  15 TiB 5.2 TiB 5.2 TiB 4.8 GiB  14 GiB 9.4 TiB 35.65 0.82  52     up osd.107
108 hdd     14.55269  1.00000  15 TiB 4.9 TiB 4.9 TiB  42 MiB  12 GiB 9.6 TiB 33.70 0.78  50     up osd.108
109 hdd     14.55269  1.00000  15 TiB 5.1 TiB 5.0 TiB  51 MiB  12 GiB 9.5 TiB 34.79 0.80  50     up osd.109
110 hdd     14.55269  1.00000  15 TiB 5.7 TiB 5.6 TiB 4.5 GiB  14 GiB 8.9 TiB 38.94 0.90  59     up osd.110
111 hdd     14.55269  1.00000  15 TiB 5.0 TiB 4.9 TiB 4.5 GiB  12 GiB 9.6 TiB 34.12 0.79  52     up osd.111
112 hdd     14.55269  1.00000  15 TiB 4.8 TiB 4.7 TiB 4.6 GiB  12 GiB 9.8 TiB 32.65 0.75  55     up osd.112
113 hdd     14.55269  1.00000  15 TiB 4.6 TiB 4.6 TiB  46 MiB  11 GiB  10 TiB 31.52 0.73  50     up osd.113
114 hdd     14.55269  1.00000  15 TiB 5.0 TiB 5.0 TiB 4.6 GiB  13 GiB 9.6 TiB 34.33 0.79  55     up osd.114
115 hdd     14.55269  1.00000  15 TiB 4.5 TiB 4.5 TiB  33 MiB  10 GiB  10 TiB 31.17 0.72  49     up osd.115
                        TOTAL 1019 TiB 441 TiB 440 TiB 232 GiB 1.2 TiB 578 TiB 43.30
MIN/MAX VAR: 0.70/1.60 STDDEV: 7.85
I set the upmap_max_deviation as you suggested, and after a few minutes it
triggered some remapping. ceph balancer status now shows:
{
"last_optimize_duration": "0:00:00.012949",
"plans": [],
"mode": "upmap",
"active": true,
"optimize_result": "Optimization plan created successfully",
"last_optimize_started": "Sat Jan 30 09:36:15 2021"
}
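I will keep an eye on the convergence with something like the following (just the commands I plan to use, nothing fancy):
ceph -s                  # shows remapped PGs and backfill progress
ceph osd df | tail -n 2  # the MIN/MAX VAR and STDDEV should shrink as it converges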
F.
Le 29/01/2021 à 23:44, Dan van der Ster a écrit :
Thanks, and thanks for the log file OTR which simply showed:
2021-01-29 23:17:32.567 7f6155cae700 4 mgr[balancer] prepared 0/10 changes
This indeed means that the balancer believes those pools are all balanced
according to the config (which you have left at the defaults).
Could you please also share the output of `ceph osd df tree` so we can
see the distribution and OSD weights?
You might simply need to decrease the upmap_max_deviation from the
default of 5. On our clusters we do:
ceph config set mgr mgr/balancer/upmap_max_deviation 1
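After setting it, you can double-check the value and see how the balancer reacts with something like (roughly, from memory):
ceph config get mgr mgr/balancer/upmap_max_deviation  # confirm the new value took effect
ceph balancer eval                                     # current distribution score
ceph balancer status                                   # should soon report new optimizations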
Cheers, Dan
On Fri, Jan 29, 2021 at 11:25 PM Francois Legrand <fleg@xxxxxxxxxxxxxx> wrote:
Hi Dan,
Here is the output of ceph balancer status :
ceph balancer status
{
    "last_optimize_duration": "0:00:00.074965",
    "plans": [],
    "mode": "upmap",
    "active": true,
    "optimize_result": "Unable to find further optimization, or pool(s) pg_num is decreasing, or distribution is already perfect",
    "last_optimize_started": "Fri Jan 29 23:13:31 2021"
}
F.
Le 29/01/2021 à 10:57, Dan van der Ster a écrit :
Hi Francois,
What is the output of `ceph balancer status` ?
Also, can you increase the debug_mgr to 4/5 then share the log file of
the active mgr?
Best,
Dan
On Fri, Jan 29, 2021 at 10:54 AM Francois Legrand <fleg@xxxxxxxxxxxxxx> wrote:
Thanks for your suggestion. I will have a look !
But I am a bit surprised that the "official" balancer seems so inefficient!
F.
Le 28/01/2021 à 12:00, Jonas Jelten a écrit :
Hi!
We also suffer heavily from this so I wrote a custom balancer which yields much better results:
https://github.com/TheJJ/ceph-balancer
After you run it, it echoes the PG movements it suggests. You can then just run those commands and the cluster will balance further.
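The suggested movements are just plain upmap commands, so they look roughly like this (the PG id and OSD numbers here are made up, yours will differ):
ceph osd pg-upmap-items 6.32 36 28   # remap this PG's placement from osd.36 to osd.28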
It's kinda work in progress, so I'm glad about your feedback.
Maybe it helps you :)
-- Jonas
On 27/01/2021 17.15, Francois Legrand wrote:
Hi all,
I have a cluster with 116 disks (24 new 16 TB disks added in December, the rest 8 TB) running Nautilus 14.2.16.
I moved (8 months ago) from crush_compat to upmap balancing.
But the cluster does not seem well balanced: the number of PGs on the 8 TB disks varies from 26 to 52, and their usage from 35 to 69%.
The recent 16 TB disks are more homogeneous, with 48 to 61 PGs and usage between 30 and 43%.
Last week, I realized that some OSDs were maybe not using upmap, because ceph osd crush weight-set ls returned (compat).
So I ran ceph osd crush weight-set rm-compat, which triggered some rebalancing. There has been no recovery for 2 days now, but the cluster is still unbalanced.
As far as I understand, upmap is supposed to reach an equal number of PGs on all the disks (weighted by their capacity, I guess).
Thus I would expect roughly 30 PGs on the 8 TB disks, 60 on the 16 TB ones, and around 50% usage everywhere, which is far from the case.
The problem is that this limits the free space available in the pools (264 TiB, while there is more than 578 TiB free in the cluster), because the pools' free space seems to be based on the space available before the first OSD becomes full!
Is this normal? Did I miss something? What could I do?
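In case it helps, this is roughly how I was counting PGs per OSD (assuming jq is installed; the JSON field names are what I believe Nautilus reports, adjust if they differ):
ceph osd df -f json | jq -r '.nodes[] | [.name, .crush_weight, .pgs, .utilization] | @tsv' | sort -k2 -n
# one line per OSD: name, crush weight, PG count, % used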
F.
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx