Near Perfect PG distribution apart from two OSDs




So I recently updated Ceph and rebooted the OSD nodes; the two OSDs are now even more unbalanced, and Ceph is actually currently moving more PGs to the OSDs in question (osd.7 and osd.8). Any ideas?



ceph balancer status

{

    "last_optimize_duration": "0:00:00.000289",

    "plans": [],

    "mode": "upmap",

    "active": true,

    "optimize_result": "Too many objects (0.008051 > 0.002000) are misplaced; try again later",

    "last_optimize_started": "Sat Feb  1 12:00:01 2020"

}
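

If I understand it correctly, the 0.002000 threshold it is comparing against comes from the mgr target_max_misplaced_ratio option (assuming 14.2.x still uses that name), so the balancer will simply wait until the current backfill drops below it. The following should show and, if really needed, temporarily raise it; the 0.01 value is only an illustration, not a recommendation:

ceph config get mgr target_max_misplaced_ratio

ceph config set mgr target_max_misplaced_ratio 0.01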



 data:

    pools:   3 pools, 613 pgs

    objects: 35.12M objects, 130 TiB

    usage:   183 TiB used, 90 TiB / 273 TiB avail

    pgs:     2826030/351190811 objects misplaced (0.805%)

             580 active+clean

             29  active+remapped+backfill_wait

             4   active+remapped+backfilling



ceph osd df

ID CLASS WEIGHT  REWEIGHT SIZE    RAW USE DATA    OMAP    META     AVAIL   %USE  VAR  PGS STATUS

23   hdd 0.00999  1.00000  10 GiB 1.4 GiB 460 MiB 1.5 MiB 1023 MiB 8.5 GiB 14.50 0.22  33     up

24   hdd 0.00999  1.00000  10 GiB 1.5 GiB 467 MiB  24 KiB 1024 MiB 8.5 GiB 14.57 0.22  34     up

25   hdd 0.00999  1.00000  10 GiB 1.5 GiB 462 MiB  28 KiB 1024 MiB 8.5 GiB 14.52 0.22  34     up

26   hdd 0.00999  1.00000  10 GiB 1.5 GiB 464 MiB 1.4 MiB 1023 MiB 8.5 GiB 14.53 0.22  34     up

27   hdd 0.00999  1.00000  10 GiB 1.5 GiB 464 MiB  40 KiB 1024 MiB 8.5 GiB 14.53 0.22  33     up

28   hdd 0.00999  1.00000  10 GiB 1.5 GiB 462 MiB  12 KiB 1024 MiB 8.5 GiB 14.52 0.22  34     up

 3   hdd 9.09599  1.00000 9.1 TiB 6.1 TiB 6.1 TiB  76 KiB   19 GiB 3.0 TiB 67.06 1.00 170     up

 4   hdd 9.09599  1.00000 9.1 TiB 5.9 TiB 5.9 TiB  32 KiB   18 GiB 3.2 TiB 64.57 0.96 164     up

 5   hdd 9.09599  1.00000 9.1 TiB 6.4 TiB 6.4 TiB  44 KiB   19 GiB 2.7 TiB 70.77 1.06 180     up

 6   hdd 9.09599  1.00000 9.1 TiB 6.2 TiB 6.1 TiB  20 KiB   19 GiB 2.9 TiB 67.65 1.01 171     up

 7   hdd 9.09599  1.00000 9.1 TiB 7.0 TiB 7.0 TiB   8 KiB   21 GiB 2.1 TiB 77.19 1.15 196     up

 8   hdd 9.09599  1.00000 9.1 TiB 6.6 TiB 6.5 TiB  56 KiB   20 GiB 2.5 TiB 72.04 1.07 183     up

 9   hdd 9.09599  1.00000 9.1 TiB 6.0 TiB 6.0 TiB  72 KiB   19 GiB 3.1 TiB 66.21 0.99 168     up

10   hdd 9.09599  1.00000 9.1 TiB 6.1 TiB 6.0 TiB   8 KiB   18 GiB 3.0 TiB 66.63 0.99 168     up

11   hdd 9.09599  1.00000 9.1 TiB 6.1 TiB 6.1 TiB  92 KiB   19 GiB 3.0 TiB 67.42 1.01 171     up

12   hdd 9.09599  1.00000 9.1 TiB 6.1 TiB 6.1 TiB   4 KiB   19 GiB 3.0 TiB 66.92 1.00 169     up

13   hdd 9.09599  1.00000 9.1 TiB 6.1 TiB 6.1 TiB  80 KiB   19 GiB 3.0 TiB 66.80 1.00 169     up

14   hdd 9.09599  1.00000 9.1 TiB 6.1 TiB 6.0 TiB  12 KiB   19 GiB 3.0 TiB 66.62 0.99 169     up

15   hdd 9.09599  1.00000 9.1 TiB 6.0 TiB 5.9 TiB  64 KiB   19 GiB 3.1 TiB 65.60 0.98 165     up

16   hdd 9.09599  1.00000 9.1 TiB 6.1 TiB 6.1 TiB  96 KiB   20 GiB 3.0 TiB 67.02 1.00 170     up

17   hdd 9.09599  1.00000 9.1 TiB 6.0 TiB 6.0 TiB  44 KiB   19 GiB 3.1 TiB 66.13 0.99 168     up

18   hdd 9.09599  1.00000 9.1 TiB 6.3 TiB 6.3 TiB  12 KiB   20 GiB 2.8 TiB 69.36 1.03 176     up

19   hdd 9.09599  1.00000 9.1 TiB 6.2 TiB 6.2 TiB  60 KiB   19 GiB 2.9 TiB 67.87 1.01 173     up

20   hdd 9.09599  1.00000 9.1 TiB 6.2 TiB 6.1 TiB  48 KiB   19 GiB 2.9 TiB 67.77 1.01 171     up

21   hdd 9.09599  1.00000 9.1 TiB 6.3 TiB 6.3 TiB  52 KiB   20 GiB 2.8 TiB 68.96 1.03 175     up

22   hdd 9.09599  1.00000 9.1 TiB 6.1 TiB 6.1 TiB  16 KiB   19 GiB 3.0 TiB 67.15 1.00 170     up

29   hdd 9.09599  1.00000 9.1 TiB 5.8 TiB 5.8 TiB  80 KiB   18 GiB 3.3 TiB 63.48 0.95 163     up

30   hdd 9.09599  1.00000 9.1 TiB 5.9 TiB 5.9 TiB  40 KiB   18 GiB 3.2 TiB 64.80 0.97 167     up

31   hdd 9.09599  1.00000 9.1 TiB 6.2 TiB 6.2 TiB 152 KiB   19 GiB 2.9 TiB 68.29 1.02 175     up

32   hdd 9.09599  1.00000 9.1 TiB 6.1 TiB 6.0 TiB 128 KiB   18 GiB 3.0 TiB 66.55 0.99 171     up

33   hdd 9.09599  1.00000 9.1 TiB 6.1 TiB 6.1 TiB  48 KiB   19 GiB 3.0 TiB 67.19 1.00 173     up

34   hdd 9.09599  1.00000 9.1 TiB 6.1 TiB 6.1 TiB  60 KiB   18 GiB 3.0 TiB 66.90 1.00 172     up

35   hdd 9.09599  1.00000 9.1 TiB 5.8 TiB 5.8 TiB  52 KiB   18 GiB 3.3 TiB 64.22 0.96 165     up

36   hdd 9.09599  1.00000 9.1 TiB 5.4 TiB 5.4 TiB 128 KiB   17 GiB 3.7 TiB 59.20 0.88 152     up

37   hdd 9.09599  1.00000 9.1 TiB 6.0 TiB 6.0 TiB  80 KiB   18 GiB 3.1 TiB 66.10 0.99 170     up

38   hdd 9.09599  1.00000 9.1 TiB 5.9 TiB 5.9 TiB  84 KiB   19 GiB 3.2 TiB 64.65 0.96 166     up

 0   hdd 0.00999  1.00000  10 GiB 1.5 GiB 464 MiB   3 KiB 1024 MiB 8.5 GiB 14.54 0.22  34     up

 1   hdd 0.00999  1.00000  10 GiB 1.4 GiB 460 MiB 1.4 MiB 1023 MiB 8.5 GiB 14.50 0.22  34     up

 2   hdd 0.00999  1.00000  10 GiB 1.5 GiB 466 MiB  24 KiB 1024 MiB 8.5 GiB 14.55 0.22  33     up

23   hdd 0.00999  1.00000  10 GiB 1.4 GiB 460 MiB 1.5 MiB 1023 MiB 8.5 GiB 14.50 0.22  33     up

24   hdd 0.00999  1.00000  10 GiB 1.5 GiB 467 MiB  24 KiB 1024 MiB 8.5 GiB 14.57 0.22  34     up

25   hdd 0.00999  1.00000  10 GiB 1.5 GiB 462 MiB  28 KiB 1024 MiB 8.5 GiB 14.52 0.22  34     up

26   hdd 0.00999  1.00000  10 GiB 1.5 GiB 464 MiB 1.4 MiB 1023 MiB 8.5 GiB 14.53 0.22  34     up

27   hdd 0.00999  1.00000  10 GiB 1.5 GiB 464 MiB  40 KiB 1024 MiB 8.5 GiB 14.53 0.22  33     up

28   hdd 0.00999  1.00000  10 GiB 1.5 GiB 462 MiB  12 KiB 1024 MiB 8.5 GiB 14.52 0.22  34     up

                    TOTAL 273 TiB 183 TiB 182 TiB 6.1 MiB  574 GiB  90 TiB 67.02

MIN/MAX VAR: 0.22/1.15  STDDEV: 30.40
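

For what it's worth, the upmap exceptions the balancer has already written into the map can be listed from the osdmap dump; the grep pattern below is just illustrative:

ceph osd dump | grep pg_upmap_items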



---- On Fri, 10 Jan 2020 14:57:05 +0800 Ashley Merrick <singapore@xxxxxxxxxxxxxx> wrote ----










Hey,



I have a cluster of 30 OSDs with near perfect distribution, minus two OSDs.



I am running Ceph version 14.2.6, though it has been the same for previous versions. I have the balancer module enabled in upmap mode and it says no improvements are possible; I have also tried crush-compat mode.



ceph balancer status

{

    "last_optimize_duration": "0:00:01.123659",

    "plans": [],

    "mode": "upmap",

    "active": true,

    "optimize_result": "Unable to find further optimization, or pool(s)' pg_num is decreasing, or distribution is already perfect",

    "last_optimize_started": "Fri Jan 10 06:11:08 2020"

}
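

As far as I understand, the same optimization pass can be tried offline against a copy of the osdmap with osdmaptool, which is roughly what the upmap balancer does internally; the file names and pool name here are placeholders:

ceph osd getmap -o /tmp/osdmap

osdmaptool /tmp/osdmap --upmap /tmp/upmap.sh --upmap-pool <pool> --upmap-deviation 1

If that produces no pg-upmap-items commands either, I'd take it as confirmation the map is as balanced as upmap will make it at that deviation.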



I have read a few email threads on the ML recently about similar cases, but I'm not sure if I am hitting the same "bug", as it's only two OSDs that are off; the rest are almost perfect.



ceph osd df

ID CLASS WEIGHT  REWEIGHT SIZE    RAW USE DATA    OMAP    META     AVAIL   %USE  VAR  PGS STATUS

23   hdd 0.00999  1.00000  10 GiB 1.4 GiB 434 MiB 1.4 MiB 1023 MiB 8.6 GiB 14.24 0.21  33     up

24   hdd 0.00999  1.00000  10 GiB 1.4 GiB 441 MiB  48 KiB 1024 MiB 8.6 GiB 14.31 0.21  34     up

25   hdd 0.00999  1.00000  10 GiB 1.4 GiB 435 MiB  24 KiB 1024 MiB 8.6 GiB 14.26 0.21  34     up

26   hdd 0.00999  1.00000  10 GiB 1.4 GiB 436 MiB 1.4 MiB 1023 MiB 8.6 GiB 14.27 0.21  34     up

27   hdd 0.00999  1.00000  10 GiB 1.4 GiB 437 MiB  16 KiB 1024 MiB 8.6 GiB 14.27 0.21  33     up

28   hdd 0.00999  1.00000  10 GiB 1.4 GiB 436 MiB  36 KiB 1024 MiB 8.6 GiB 14.26 0.21  34     up

 3   hdd 9.09599  1.00000 9.1 TiB 6.1 TiB 6.1 TiB  76 KiB   19 GiB 3.0 TiB 67.26 1.00 170     up

 4   hdd 9.09599  1.00000 9.1 TiB 6.2 TiB 6.1 TiB  44 KiB   19 GiB 2.9 TiB 67.77 1.01 172     up

 5   hdd 9.09599  1.00000 9.1 TiB 6.3 TiB 6.3 TiB 112 KiB   20 GiB 2.8 TiB 69.50 1.03 176     up

 6   hdd 9.09599  1.00000 9.1 TiB 6.1 TiB 6.1 TiB  17 KiB   19 GiB 2.9 TiB 67.58 1.01 171     up

 7   hdd 9.09599  1.00000 9.1 TiB 6.7 TiB 6.7 TiB  88 KiB   21 GiB 2.4 TiB 73.98 1.10 187     up

 8   hdd 9.09599  1.00000 9.1 TiB 6.5 TiB 6.5 TiB  76 KiB   20 GiB 2.6 TiB 71.84 1.07 182     up

 9   hdd 9.09599  1.00000 9.1 TiB 6.1 TiB 6.1 TiB 120 KiB   19 GiB 3.0 TiB 67.24 1.00 170     up

10   hdd 9.09599  1.00000 9.1 TiB 6.1 TiB 6.1 TiB  72 KiB   19 GiB 3.0 TiB 67.19 1.00 170     up

11   hdd 9.09599  1.00000 9.1 TiB 6.2 TiB 6.2 TiB  40 KiB   19 GiB 2.9 TiB 68.06 1.01 172     up

12   hdd 9.09599  1.00000 9.1 TiB 6.1 TiB 6.1 TiB  28 KiB   19 GiB 3.0 TiB 67.48 1.00 170     up

13   hdd 9.09599  1.00000 9.1 TiB 6.1 TiB 6.1 TiB  36 KiB   19 GiB 3.0 TiB 67.04 1.00 170     up

14   hdd 9.09599  1.00000 9.1 TiB 6.1 TiB 6.1 TiB 108 KiB   19 GiB 3.0 TiB 67.30 1.00 170     up

15   hdd 9.09599  1.00000 9.1 TiB 6.1 TiB 6.1 TiB  68 KiB   19 GiB 3.0 TiB 67.41 1.00 170     up

16   hdd 9.09599  1.00000 9.1 TiB 6.1 TiB 6.1 TiB 152 KiB   19 GiB 2.9 TiB 67.61 1.01 171     up

17   hdd 9.09599  1.00000 9.1 TiB 6.1 TiB 6.1 TiB  36 KiB   19 GiB 3.0 TiB 67.16 1.00 170     up

18   hdd 9.09599  1.00000 9.1 TiB 6.1 TiB 6.1 TiB  41 KiB   19 GiB 3.0 TiB 67.19 1.00 170     up

19   hdd 9.09599  1.00000 9.1 TiB 6.1 TiB 6.1 TiB  64 KiB   19 GiB 3.0 TiB 67.49 1.00 171     up

20   hdd 9.09599  1.00000 9.1 TiB 6.1 TiB 6.1 TiB  12 KiB   19 GiB 3.0 TiB 67.55 1.01 171     up

21   hdd 9.09599  1.00000 9.1 TiB 6.2 TiB 6.1 TiB  76 KiB   19 GiB 2.9 TiB 67.76 1.01 171     up

22   hdd 9.09599  1.00000 9.1 TiB 6.2 TiB 6.2 TiB  12 KiB   19 GiB 2.9 TiB 68.05 1.01 172     up

29   hdd 9.09599  1.00000 9.1 TiB 5.8 TiB 5.8 TiB 108 KiB   17 GiB 3.3 TiB 63.59 0.95 163     up

30   hdd 9.09599  1.00000 9.1 TiB 5.9 TiB 5.9 TiB  24 KiB   18 GiB 3.2 TiB 65.18 0.97 167     up

31   hdd 9.09599  1.00000 9.1 TiB 6.1 TiB 6.1 TiB  44 KiB   18 GiB 3.0 TiB 66.74 0.99 171     up

32   hdd 9.09599  1.00000 9.1 TiB 6.0 TiB 6.0 TiB 220 KiB   18 GiB 3.1 TiB 66.31 0.99 170     up

33   hdd 9.09599  1.00000 9.1 TiB 6.0 TiB 5.9 TiB  36 KiB   18 GiB 3.1 TiB 65.54 0.98 168     up

34   hdd 9.09599  1.00000 9.1 TiB 6.0 TiB 6.0 TiB  44 KiB   18 GiB 3.1 TiB 66.33 0.99 170     up

35   hdd 9.09599  1.00000 9.1 TiB 5.9 TiB 5.9 TiB  68 KiB   18 GiB 3.2 TiB 64.77 0.96 166     up

36   hdd 9.09599  1.00000 9.1 TiB 5.8 TiB 5.8 TiB 168 KiB   17 GiB 3.3 TiB 63.60 0.95 163     up

37   hdd 9.09599  1.00000 9.1 TiB 6.0 TiB 6.0 TiB  60 KiB   18 GiB 3.1 TiB 65.91 0.98 169     up

38   hdd 9.09599  1.00000 9.1 TiB 5.9 TiB 5.9 TiB  68 KiB   18 GiB 3.2 TiB 65.15 0.97 167     up

 0   hdd 0.00999  1.00000  10 GiB 1.4 GiB 437 MiB  28 KiB 1024 MiB 8.6 GiB 14.27 0.21  34     up

 1   hdd 0.00999  1.00000  10 GiB 1.4 GiB 434 MiB 1.4 MiB 1023 MiB 8.6 GiB 14.24 0.21  34     up

 2   hdd 0.00999  1.00000  10 GiB 1.4 GiB 439 MiB  36 KiB 1024 MiB 8.6 GiB 14.29 0.21  33     up

23   hdd 0.00999  1.00000  10 GiB 1.4 GiB 434 MiB 1.4 MiB 1023 MiB 8.6 GiB 14.24 0.21  33     up

24   hdd 0.00999  1.00000  10 GiB 1.4 GiB 441 MiB  48 KiB 1024 MiB 8.6 GiB 14.31 0.21  34     up

25   hdd 0.00999  1.00000  10 GiB 1.4 GiB 435 MiB  24 KiB 1024 MiB 8.6 GiB 14.26 0.21  34     up

26   hdd 0.00999  1.00000  10 GiB 1.4 GiB 436 MiB 1.4 MiB 1023 MiB 8.6 GiB 14.27 0.21  34     up

27   hdd 0.00999  1.00000  10 GiB 1.4 GiB 437 MiB  16 KiB 1024 MiB 8.6 GiB 14.27 0.21  33     up

28   hdd 0.00999  1.00000  10 GiB 1.4 GiB 436 MiB  36 KiB 1024 MiB 8.6 GiB 14.26 0.21  34     up

                    TOTAL 273 TiB 183 TiB 183 TiB 6.4 MiB  567 GiB  90 TiB 67.17



ceph osd tree

ID  CLASS WEIGHT    TYPE NAME       STATUS REWEIGHT PRI-AFF

-12         0.05798 root default

-11         0.02899     host sn-m02

 23   hdd   0.00999         osd.23      up  1.00000 1.00000

 24   hdd   0.00999         osd.24      up  1.00000 1.00000

 25   hdd   0.00999         osd.25      up  1.00000 1.00000

-15         0.02899     host sn-m03

 26   hdd   0.00999         osd.26      up  1.00000 1.00000

 27   hdd   0.00999         osd.27      up  1.00000 1.00000

 28   hdd   0.00999         osd.28      up  1.00000 1.00000

 -6       272.87100 root ec

 -5        90.95700     host sn-s01

  3   hdd   9.09599         osd.3       up  1.00000 1.00000

  4   hdd   9.09599         osd.4       up  1.00000 1.00000

  5   hdd   9.09599         osd.5       up  1.00000 1.00000

  6   hdd   9.09599         osd.6       up  1.00000 1.00000

  7   hdd   9.09599         osd.7       up  1.00000 1.00000

  8   hdd   9.09599         osd.8       up  1.00000 1.00000

  9   hdd   9.09599         osd.9       up  1.00000 1.00000

 10   hdd   9.09599         osd.10      up  1.00000 1.00000

 11   hdd   9.09599         osd.11      up  1.00000 1.00000

 12   hdd   9.09599         osd.12      up  1.00000 1.00000

 -9        90.95700     host sn-s02

 13   hdd   9.09599         osd.13      up  1.00000 1.00000

 14   hdd   9.09599         osd.14      up  1.00000 1.00000

 15   hdd   9.09599         osd.15      up  1.00000 1.00000

 16   hdd   9.09599         osd.16      up  1.00000 1.00000

 17   hdd   9.09599         osd.17      up  1.00000 1.00000

 18   hdd   9.09599         osd.18      up  1.00000 1.00000

 19   hdd   9.09599         osd.19      up  1.00000 1.00000

 20   hdd   9.09599         osd.20      up  1.00000 1.00000

 21   hdd   9.09599         osd.21      up  1.00000 1.00000

 22   hdd   9.09599         osd.22      up  1.00000 1.00000

-17        90.95700     host sn-s03

 29   hdd   9.09599         osd.29      up  1.00000 1.00000

 30   hdd   9.09599         osd.30      up  1.00000 1.00000

 31   hdd   9.09599         osd.31      up  1.00000 1.00000

 32   hdd   9.09599         osd.32      up  1.00000 1.00000

 33   hdd   9.09599         osd.33      up  1.00000 1.00000

 34   hdd   9.09599         osd.34      up  1.00000 1.00000

 35   hdd   9.09599         osd.35      up  1.00000 1.00000

 36   hdd   9.09599         osd.36      up  1.00000 1.00000

 37   hdd   9.09599         osd.37      up  1.00000 1.00000

 38   hdd   9.09599         osd.38      up  1.00000 1.00000

 -1         0.08698 root meta

 -3         0.02899     host sn-m01

  0   hdd   0.00999         osd.0       up  1.00000 1.00000

  1   hdd   0.00999         osd.1       up  1.00000 1.00000

  2   hdd   0.00999         osd.2       up  1.00000 1.00000

-11         0.02899     host sn-m02

 23   hdd   0.00999         osd.23      up  1.00000 1.00000

 24   hdd   0.00999         osd.24      up  1.00000 1.00000

 25   hdd   0.00999         osd.25      up  1.00000 1.00000

-15         0.02899     host sn-m03

 26   hdd   0.00999         osd.26      up  1.00000 1.00000

 27   hdd   0.00999         osd.27      up  1.00000 1.00000

 28   hdd   0.00999         osd.28      up  1.00000 1.00000



osd.7 and osd.8 are the problem OSDs, sitting at 187 and 182 PGs respectively, where the others are all at 170 or 171.



Am I hitting the same issue? Or is there something I can do to rebalance these extra PGs across the rest of the OSDs better?
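

If it comes to it, I assume a couple of PGs could be pinned away from osd.7/osd.8 by hand with pg-upmap-items; the PG ID below is a placeholder, and osd.9 is just an example of a less-full OSD on the same host (so the move stays within the same failure domain):

ceph pg ls-by-osd 7

ceph osd pg-upmap-items <pgid> 7 9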



Thanks
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



