So I recently updated Ceph and rebooted the OSD nodes. The two OSDs are now even more unbalanced, and Ceph is currently moving more PGs onto the OSDs in question (osd.7 and osd.8). Any ideas?

ceph balancer status
{
    "last_optimize_duration": "0:00:00.000289",
    "plans": [],
    "mode": "upmap",
    "active": true,
    "optimize_result": "Too many objects (0.008051 > 0.002000) are misplaced; try again later",
    "last_optimize_started": "Sat Feb 1 12:00:01 2020"
}

  data:
    pools:   3 pools, 613 pgs
    objects: 35.12M objects, 130 TiB
    usage:   183 TiB used, 90 TiB / 273 TiB avail
    pgs:     2826030/351190811 objects misplaced (0.805%)
             580 active+clean
             29  active+remapped+backfill_wait
             4   active+remapped+backfilling

ceph osd df
ID CLASS WEIGHT REWEIGHT SIZE RAW USE DATA OMAP META AVAIL %USE VAR PGS STATUS
23 hdd 0.00999 1.00000 10 GiB 1.4 GiB 460 MiB 1.5 MiB 1023 MiB 8.5 GiB 14.50 0.22 33 up
24 hdd 0.00999 1.00000 10 GiB 1.5 GiB 467 MiB 24 KiB 1024 MiB 8.5 GiB 14.57 0.22 34 up
25 hdd 0.00999 1.00000 10 GiB 1.5 GiB 462 MiB 28 KiB 1024 MiB 8.5 GiB 14.52 0.22 34 up
26 hdd 0.00999 1.00000 10 GiB 1.5 GiB 464 MiB 1.4 MiB 1023 MiB 8.5 GiB 14.53 0.22 34 up
27 hdd 0.00999 1.00000 10 GiB 1.5 GiB 464 MiB 40 KiB 1024 MiB 8.5 GiB 14.53 0.22 33 up
28 hdd 0.00999 1.00000 10 GiB 1.5 GiB 462 MiB 12 KiB 1024 MiB 8.5 GiB 14.52 0.22 34 up
3 hdd 9.09599 1.00000 9.1 TiB 6.1 TiB 6.1 TiB 76 KiB 19 GiB 3.0 TiB 67.06 1.00 170 up
4 hdd 9.09599 1.00000 9.1 TiB 5.9 TiB 5.9 TiB 32 KiB 18 GiB 3.2 TiB 64.57 0.96 164 up
5 hdd 9.09599 1.00000 9.1 TiB 6.4 TiB 6.4 TiB 44 KiB 19 GiB 2.7 TiB 70.77 1.06 180 up
6 hdd 9.09599 1.00000 9.1 TiB 6.2 TiB 6.1 TiB 20 KiB 19 GiB 2.9 TiB 67.65 1.01 171 up
7 hdd 9.09599 1.00000 9.1 TiB 7.0 TiB 7.0 TiB 8 KiB 21 GiB 2.1 TiB 77.19 1.15 196 up
8 hdd 9.09599 1.00000 9.1 TiB 6.6 TiB 6.5 TiB 56 KiB 20 GiB 2.5 TiB 72.04 1.07 183 up
9 hdd 9.09599 1.00000 9.1 TiB 6.0 TiB 6.0 TiB 72 KiB 19 GiB 3.1 TiB 66.21 0.99 168 up
10 hdd 9.09599 1.00000 9.1 TiB 6.1 TiB 6.0 TiB 8 KiB 18 GiB 3.0 TiB 66.63 0.99 168 up
11 hdd 9.09599 1.00000 9.1 TiB 6.1 TiB 6.1 TiB 92 KiB 19 GiB 3.0 TiB 67.42 1.01 171 up
12 hdd 9.09599 1.00000 9.1 TiB 6.1 TiB 6.1 TiB 4 KiB 19 GiB 3.0 TiB 66.92 1.00 169 up
13 hdd 9.09599 1.00000 9.1 TiB 6.1 TiB 6.1 TiB 80 KiB 19 GiB 3.0 TiB 66.80 1.00 169 up
14 hdd 9.09599 1.00000 9.1 TiB 6.1 TiB 6.0 TiB 12 KiB 19 GiB 3.0 TiB 66.62 0.99 169 up
15 hdd 9.09599 1.00000 9.1 TiB 6.0 TiB 5.9 TiB 64 KiB 19 GiB 3.1 TiB 65.60 0.98 165 up
16 hdd 9.09599 1.00000 9.1 TiB 6.1 TiB 6.1 TiB 96 KiB 20 GiB 3.0 TiB 67.02 1.00 170 up
17 hdd 9.09599 1.00000 9.1 TiB 6.0 TiB 6.0 TiB 44 KiB 19 GiB 3.1 TiB 66.13 0.99 168 up
18 hdd 9.09599 1.00000 9.1 TiB 6.3 TiB 6.3 TiB 12 KiB 20 GiB 2.8 TiB 69.36 1.03 176 up
19 hdd 9.09599 1.00000 9.1 TiB 6.2 TiB 6.2 TiB 60 KiB 19 GiB 2.9 TiB 67.87 1.01 173 up
20 hdd 9.09599 1.00000 9.1 TiB 6.2 TiB 6.1 TiB 48 KiB 19 GiB 2.9 TiB 67.77 1.01 171 up
21 hdd 9.09599 1.00000 9.1 TiB 6.3 TiB 6.3 TiB 52 KiB 20 GiB 2.8 TiB 68.96 1.03 175 up
22 hdd 9.09599 1.00000 9.1 TiB 6.1 TiB 6.1 TiB 16 KiB 19 GiB 3.0 TiB 67.15 1.00 170 up
29 hdd 9.09599 1.00000 9.1 TiB 5.8 TiB 5.8 TiB 80 KiB 18 GiB 3.3 TiB 63.48 0.95 163 up
30 hdd 9.09599 1.00000 9.1 TiB 5.9 TiB 5.9 TiB 40 KiB 18 GiB 3.2 TiB 64.80 0.97 167 up
31 hdd 9.09599 1.00000 9.1 TiB 6.2 TiB 6.2 TiB 152 KiB 19 GiB 2.9 TiB 68.29 1.02 175 up
32 hdd 9.09599 1.00000 9.1 TiB 6.1 TiB 6.0 TiB 128 KiB 18 GiB 3.0 TiB 66.55 0.99 171 up
33 hdd 9.09599 1.00000 9.1 TiB 6.1 TiB 6.1 TiB 48 KiB 19 GiB 3.0 TiB 67.19 1.00 173 up
34 hdd 9.09599 1.00000 9.1 TiB 6.1 TiB 6.1 TiB 60 KiB 18 GiB 3.0 TiB 66.90 1.00 172 up
35 hdd 9.09599 1.00000 9.1 TiB 5.8 TiB 5.8 TiB 52 KiB 18 GiB 3.3 TiB 64.22 0.96 165 up
36 hdd 9.09599 1.00000 9.1 TiB 5.4 TiB 5.4 TiB 128 KiB 17 GiB 3.7 TiB 59.20 0.88 152 up
37 hdd 9.09599 1.00000 9.1 TiB 6.0 TiB 6.0 TiB 80 KiB 18 GiB 3.1 TiB 66.10 0.99 170 up
38 hdd 9.09599 1.00000 9.1 TiB 5.9 TiB 5.9 TiB 84 KiB 19 GiB 3.2 TiB 64.65 0.96 166 up
0 hdd 0.00999 1.00000 10 GiB 1.5 GiB 464 MiB 3 KiB 1024 MiB 8.5 GiB 14.54 0.22 34 up
1 hdd 0.00999 1.00000 10 GiB 1.4 GiB 460 MiB 1.4 MiB 1023 MiB 8.5 GiB 14.50 0.22 34 up
2 hdd 0.00999 1.00000 10 GiB 1.5 GiB 466 MiB 24 KiB 1024 MiB 8.5 GiB 14.55 0.22 33 up
23 hdd 0.00999 1.00000 10 GiB 1.4 GiB 460 MiB 1.5 MiB 1023 MiB 8.5 GiB 14.50 0.22 33 up
24 hdd 0.00999 1.00000 10 GiB 1.5 GiB 467 MiB 24 KiB 1024 MiB 8.5 GiB 14.57 0.22 34 up
25 hdd 0.00999 1.00000 10 GiB 1.5 GiB 462 MiB 28 KiB 1024 MiB 8.5 GiB 14.52 0.22 34 up
26 hdd 0.00999 1.00000 10 GiB 1.5 GiB 464 MiB 1.4 MiB 1023 MiB 8.5 GiB 14.53 0.22 34 up
27 hdd 0.00999 1.00000 10 GiB 1.5 GiB 464 MiB 40 KiB 1024 MiB 8.5 GiB 14.53 0.22 33 up
28 hdd 0.00999 1.00000 10 GiB 1.5 GiB 462 MiB 12 KiB 1024 MiB 8.5 GiB 14.52 0.22 34 up
TOTAL 273 TiB 183 TiB 182 TiB 6.1 MiB 574 GiB 90 TiB 67.02
MIN/MAX VAR: 0.22/1.15 STDDEV: 30.40
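The "try again later" result means the balancer is pausing itself: the current misplaced ratio (0.805%) is above the ceiling it is configured with (0.2%), so it should resume planning on its own once the backfill onto osd.7/osd.8 finishes. If the balancer should keep optimizing while data is still moving, that ceiling can be raised. A minimal sketch, assuming a Nautilus-era setup where the mgr option is named target_max_misplaced_ratio (the 0.002000 in the output is lower than the upstream default of 0.05, so it may have been set locally; on Luminous the equivalent was the config-key mgr/balancer/max_misplaced):

    # check the current ceiling on the misplaced ratio
    ceph config get mgr target_max_misplaced_ratio
    # raise it so the balancer keeps producing plans during backfill
    ceph config set mgr target_max_misplaced_ratio 0.05
    # confirm the balancer starts optimizing again
    ceph balancer status

The trade-off is that a higher ceiling lets the balancer queue more concurrent data movement, which competes with client I/O, so raising it mid-backfill is optional rather than required.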
---- On Fri, 10 Jan 2020 14:57:05 +0800 Ashley Merrick <singapore@xxxxxxxxxxxxxx> wrote ----

Hey,

I have a cluster of 30 OSDs with near-perfect distribution apart from two OSDs. I am running ceph version 14.2.6, though the behaviour has been the same on previous versions. I have the balancer module enabled in upmap mode and it says no improvements; I have also tried crush-compat mode.

ceph balancer status
{
    "last_optimize_duration": "0:00:01.123659",
    "plans": [],
    "mode": "upmap",
    "active": true,
    "optimize_result": "Unable to find further optimization, or pool(s)' pg_num is decreasing, or distribution is already perfect",
    "last_optimize_started": "Fri Jan 10 06:11:08 2020"
}

I have read a few email threads on the ML recently about similar cases, but I am not sure whether I am hitting the same "bug", as only two OSDs are off and the rest are almost perfect.

ceph osd df
ID CLASS WEIGHT REWEIGHT SIZE RAW USE DATA OMAP META AVAIL %USE VAR PGS STATUS
23 hdd 0.00999 1.00000 10 GiB 1.4 GiB 434 MiB 1.4 MiB 1023 MiB 8.6 GiB 14.24 0.21 33 up
24 hdd 0.00999 1.00000 10 GiB 1.4 GiB 441 MiB 48 KiB 1024 MiB 8.6 GiB 14.31 0.21 34 up
25 hdd 0.00999 1.00000 10 GiB 1.4 GiB 435 MiB 24 KiB 1024 MiB 8.6 GiB 14.26 0.21 34 up
26 hdd 0.00999 1.00000 10 GiB 1.4 GiB 436 MiB 1.4 MiB 1023 MiB 8.6 GiB 14.27 0.21 34 up
27 hdd 0.00999 1.00000 10 GiB 1.4 GiB 437 MiB 16 KiB 1024 MiB 8.6 GiB 14.27 0.21 33 up
28 hdd 0.00999 1.00000 10 GiB 1.4 GiB 436 MiB 36 KiB 1024 MiB 8.6 GiB 14.26 0.21 34 up
3 hdd 9.09599 1.00000 9.1 TiB 6.1 TiB 6.1 TiB 76 KiB 19 GiB 3.0 TiB 67.26 1.00 170 up
4 hdd 9.09599 1.00000 9.1 TiB 6.2 TiB 6.1 TiB 44 KiB 19 GiB 2.9 TiB 67.77 1.01 172 up
5 hdd 9.09599 1.00000 9.1 TiB 6.3 TiB 6.3 TiB 112 KiB 20 GiB 2.8 TiB 69.50 1.03 176 up
6 hdd 9.09599 1.00000 9.1 TiB 6.1 TiB 6.1 TiB 17 KiB 19 GiB 2.9 TiB 67.58 1.01 171 up
7 hdd 9.09599 1.00000 9.1 TiB 6.7 TiB 6.7 TiB 88 KiB 21 GiB 2.4 TiB 73.98 1.10 187 up
8 hdd 9.09599 1.00000 9.1 TiB 6.5 TiB 6.5 TiB 76 KiB 20 GiB 2.6 TiB 71.84 1.07 182 up
9 hdd 9.09599 1.00000 9.1 TiB 6.1 TiB 6.1 TiB 120 KiB 19 GiB 3.0 TiB 67.24 1.00 170 up
10 hdd 9.09599 1.00000 9.1 TiB 6.1 TiB 6.1 TiB 72 KiB 19 GiB 3.0 TiB 67.19 1.00 170 up
11 hdd 9.09599 1.00000 9.1 TiB 6.2 TiB 6.2 TiB 40 KiB 19 GiB 2.9 TiB 68.06 1.01 172 up
12 hdd 9.09599 1.00000 9.1 TiB 6.1 TiB 6.1 TiB 28 KiB 19 GiB 3.0 TiB 67.48 1.00 170 up
13 hdd 9.09599 1.00000 9.1 TiB 6.1 TiB 6.1 TiB 36 KiB 19 GiB 3.0 TiB 67.04 1.00 170 up
14 hdd 9.09599 1.00000 9.1 TiB 6.1 TiB 6.1 TiB 108 KiB 19 GiB 3.0 TiB 67.30 1.00 170 up
15 hdd 9.09599 1.00000 9.1 TiB 6.1 TiB 6.1 TiB 68 KiB 19 GiB 3.0 TiB 67.41 1.00 170 up
16 hdd 9.09599 1.00000 9.1 TiB 6.1 TiB 6.1 TiB 152 KiB 19 GiB 2.9 TiB 67.61 1.01 171 up
17 hdd 9.09599 1.00000 9.1 TiB 6.1 TiB 6.1 TiB 36 KiB 19 GiB 3.0 TiB 67.16 1.00 170 up
18 hdd 9.09599 1.00000 9.1 TiB 6.1 TiB 6.1 TiB 41 KiB 19 GiB 3.0 TiB 67.19 1.00 170 up
19 hdd 9.09599 1.00000 9.1 TiB 6.1 TiB 6.1 TiB 64 KiB 19 GiB 3.0 TiB 67.49 1.00 171 up
20 hdd 9.09599 1.00000 9.1 TiB 6.1 TiB 6.1 TiB 12 KiB 19 GiB 3.0 TiB 67.55 1.01 171 up
21 hdd 9.09599 1.00000 9.1 TiB 6.2 TiB 6.1 TiB 76 KiB 19 GiB 2.9 TiB 67.76 1.01 171 up
22 hdd 9.09599 1.00000 9.1 TiB 6.2 TiB 6.2 TiB 12 KiB 19 GiB 2.9 TiB 68.05 1.01 172 up
29 hdd 9.09599 1.00000 9.1 TiB 5.8 TiB 5.8 TiB 108 KiB 17 GiB 3.3 TiB 63.59 0.95 163 up
30 hdd 9.09599 1.00000 9.1 TiB 5.9 TiB 5.9 TiB 24 KiB 18 GiB 3.2 TiB 65.18 0.97 167 up
31 hdd 9.09599 1.00000 9.1 TiB 6.1 TiB 6.1 TiB 44 KiB 18 GiB 3.0 TiB 66.74 0.99 171 up
32 hdd 9.09599 1.00000 9.1 TiB 6.0 TiB 6.0 TiB 220 KiB 18 GiB 3.1 TiB 66.31 0.99 170 up
33 hdd 9.09599 1.00000 9.1 TiB 6.0 TiB 5.9 TiB 36 KiB 18 GiB 3.1 TiB 65.54 0.98 168 up
34 hdd 9.09599 1.00000 9.1 TiB 6.0 TiB 6.0 TiB 44 KiB 18 GiB 3.1 TiB 66.33 0.99 170 up
35 hdd 9.09599 1.00000 9.1 TiB 5.9 TiB 5.9 TiB 68 KiB 18 GiB 3.2 TiB 64.77 0.96 166 up
36 hdd 9.09599 1.00000 9.1 TiB 5.8 TiB 5.8 TiB 168 KiB 17 GiB 3.3 TiB 63.60 0.95 163 up
37 hdd 9.09599 1.00000 9.1 TiB 6.0 TiB 6.0 TiB 60 KiB 18 GiB 3.1 TiB 65.91 0.98 169 up
38 hdd 9.09599 1.00000 9.1 TiB 5.9 TiB 5.9 TiB 68 KiB 18 GiB 3.2 TiB 65.15 0.97 167 up
0 hdd 0.00999 1.00000 10 GiB 1.4 GiB 437 MiB 28 KiB 1024 MiB 8.6 GiB 14.27 0.21 34 up
1 hdd 0.00999 1.00000 10 GiB 1.4 GiB 434 MiB 1.4 MiB 1023 MiB 8.6 GiB 14.24 0.21 34 up
2 hdd 0.00999 1.00000 10 GiB 1.4 GiB 439 MiB 36 KiB 1024 MiB 8.6 GiB 14.29 0.21 33 up
23 hdd 0.00999 1.00000 10 GiB 1.4 GiB 434 MiB 1.4 MiB 1023 MiB 8.6 GiB 14.24 0.21 33 up
24 hdd 0.00999 1.00000 10 GiB 1.4 GiB 441 MiB 48 KiB 1024 MiB 8.6 GiB 14.31 0.21 34 up
25 hdd 0.00999 1.00000 10 GiB 1.4 GiB 435 MiB 24 KiB 1024 MiB 8.6 GiB 14.26 0.21 34 up
26 hdd 0.00999 1.00000 10 GiB 1.4 GiB 436 MiB 1.4 MiB 1023 MiB 8.6 GiB 14.27 0.21 34 up
27 hdd 0.00999 1.00000 10 GiB 1.4 GiB 437 MiB 16 KiB 1024 MiB 8.6 GiB 14.27 0.21 33 up
28 hdd 0.00999 1.00000 10 GiB 1.4 GiB 436 MiB 36 KiB 1024 MiB 8.6 GiB 14.26 0.21 34 up
TOTAL 273 TiB 183 TiB 183 TiB 6.4 MiB 567 GiB 90 TiB 67.17

ceph osd tree
ID  CLASS WEIGHT    TYPE NAME       STATUS REWEIGHT PRI-AFF
-12       0.05798   root default
-11       0.02899       host sn-m02
 23   hdd 0.00999           osd.23      up  1.00000 1.00000
 24   hdd 0.00999           osd.24      up  1.00000 1.00000
 25   hdd 0.00999           osd.25      up  1.00000 1.00000
-15       0.02899       host sn-m03
 26   hdd 0.00999           osd.26      up  1.00000 1.00000
 27   hdd 0.00999           osd.27      up  1.00000 1.00000
 28   hdd 0.00999           osd.28      up  1.00000 1.00000
 -6       272.87100 root ec
 -5       90.95700      host sn-s01
  3   hdd 9.09599           osd.3       up  1.00000 1.00000
  4   hdd 9.09599           osd.4       up  1.00000 1.00000
  5   hdd 9.09599           osd.5       up  1.00000 1.00000
  6   hdd 9.09599           osd.6       up  1.00000 1.00000
  7   hdd 9.09599           osd.7       up  1.00000 1.00000
  8   hdd 9.09599           osd.8       up  1.00000 1.00000
  9   hdd 9.09599           osd.9       up  1.00000 1.00000
 10   hdd 9.09599           osd.10      up  1.00000 1.00000
 11   hdd 9.09599           osd.11      up  1.00000 1.00000
 12   hdd 9.09599           osd.12      up  1.00000 1.00000
 -9       90.95700      host sn-s02
 13   hdd 9.09599           osd.13      up  1.00000 1.00000
 14   hdd 9.09599           osd.14      up  1.00000 1.00000
 15   hdd 9.09599           osd.15      up  1.00000 1.00000
 16   hdd 9.09599           osd.16      up  1.00000 1.00000
 17   hdd 9.09599           osd.17      up  1.00000 1.00000
 18   hdd 9.09599           osd.18      up  1.00000 1.00000
 19   hdd 9.09599           osd.19      up  1.00000 1.00000
 20   hdd 9.09599           osd.20      up  1.00000 1.00000
 21   hdd 9.09599           osd.21      up  1.00000 1.00000
 22   hdd 9.09599           osd.22      up  1.00000 1.00000
-17       90.95700      host sn-s03
 29   hdd 9.09599           osd.29      up  1.00000 1.00000
 30   hdd 9.09599           osd.30      up  1.00000 1.00000
 31   hdd 9.09599           osd.31      up  1.00000 1.00000
 32   hdd 9.09599           osd.32      up  1.00000 1.00000
 33   hdd 9.09599           osd.33      up  1.00000 1.00000
 34   hdd 9.09599           osd.34      up  1.00000 1.00000
 35   hdd 9.09599           osd.35      up  1.00000 1.00000
 36   hdd 9.09599           osd.36      up  1.00000 1.00000
 37   hdd 9.09599           osd.37      up  1.00000 1.00000
 38   hdd 9.09599           osd.38      up  1.00000 1.00000
 -1       0.08698   root meta
 -3       0.02899       host sn-m01
  0   hdd 0.00999           osd.0       up  1.00000 1.00000
  1   hdd 0.00999           osd.1       up  1.00000 1.00000
  2   hdd 0.00999           osd.2       up  1.00000 1.00000
-11       0.02899       host sn-m02
 23   hdd 0.00999           osd.23      up  1.00000 1.00000
 24   hdd 0.00999           osd.24      up  1.00000 1.00000
 25   hdd 0.00999           osd.25      up  1.00000 1.00000
-15       0.02899       host sn-m03
 26   hdd 0.00999           osd.26      up  1.00000 1.00000
 27   hdd 0.00999           osd.27      up  1.00000 1.00000
 28   hdd 0.00999           osd.28      up  1.00000 1.00000

osd.7 and osd.8 are the problem OSDs, sitting at 187 and 182 PGs, where the others are all sitting at 170 or 171. Am I hitting the same issue? Or is there something I can do to rebalance these extra PGs across the rest of the OSDs better?

Thanks
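If the balancer keeps declaring the distribution optimal while osd.7 and osd.8 stay high, the remaining skew can also be evened out by hand with upmap exceptions, which is the same mechanism the balancer itself writes in upmap mode. A rough sketch under stated assumptions: the PG id 2.1a below is a placeholder, and osd.36 is chosen only because it is the most underfull OSD in the df output; pick real PGs from the listing and a target in the same CRUSH root so the EC rule is still satisfied:

    # see which PGs currently sit on the overfull OSD
    ceph pg ls-by-osd osd.7
    # remap one PG's shard from osd.7 to osd.36 (placeholder PG id; repeat
    # for a handful of PGs until the counts even out)
    ceph osd pg-upmap-items 2.1a 7 36
    # undo a mapping if the result is worse
    ceph osd rm-pg-upmap-items 2.1a

Note that the active balancer may later adjust or remove manual upmap entries, and the coarser alternative of an override weight (ceph osd reweight osd.7 0.95) is generally best avoided while the upmap balancer is on, since override weights and upmaps can work against each other.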