Turn off the autoscaler and increase pg_num to 512 or so (a power of 2).
The general recommendation is to have between 100 and 150 PGs per OSD
(including replicas), and then let the balancer handle the rest. What is
the current balancer status (ceph balancer status)?
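
As a rough sketch (replace <pool> with your actual pool name; upmap mode
is one option and needs the min-compat-client luminous setting you
already applied; expect data movement while pg_num increases):

    ceph osd pool set <pool> pg_autoscale_mode off
    ceph osd pool set <pool> pg_num 512
    ceph balancer mode upmap
    ceph balancer on
    ceph balancer status

Assuming a replicated pool with size 3 (not shown in your output), 512
PGs across your 21 OSDs is roughly 512 * 3 / 21 ≈ 73 PGs per OSD; 1024
would land in the 100-150 range, but 512 is a reasonable first step.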
Quoting Spiros Papageorgiou <papage@xxxxxxxxxxx>:
Hi all,
I have a Ceph cluster with 3 nodes running Ceph version 16.2.9. There
are 7 SSD OSDs on each server and one pool that resides on these OSDs.
My OSDs are terribly unbalanced:
ID  CLASS  WEIGHT    REWEIGHT  SIZE     RAW USE  DATA     OMAP     META     AVAIL    %USE   VAR   PGS  STATUS  TYPE NAME
-9         28.42200         -   28 TiB  9.3 TiB  9.2 TiB  161 MiB   26 GiB   19 TiB  32.56  1.09    -          root ssddisks
-2          9.47400         -  9.5 TiB  3.4 TiB  3.4 TiB   66 MiB  9.2 GiB  6.1 TiB  35.52  1.19    -          host px1-ssd
 0  ssd     1.74599   0.85004  1.7 TiB  810 GiB  807 GiB  3.2 MiB  2.3 GiB  978 GiB  45.28  1.51   26      up  osd.0
 5  ssd     0.82999   0.85004  850 GiB  581 GiB  580 GiB   22 MiB  912 MiB  269 GiB  68.38  2.29   19      up  osd.5
 6  ssd     0.82999   1.00000  850 GiB  8.2 GiB  7.8 GiB  9.5 MiB  435 MiB  842 GiB   0.97  0.03    4      up  osd.6
 7  ssd     0.82999   1.00000  850 GiB  294 GiB  293 GiB   26 MiB  591 MiB  556 GiB  34.60  1.16   11      up  osd.7
16  ssd     1.74599   0.85004  1.7 TiB  872 GiB  869 GiB  3.1 MiB  2.3 GiB  916 GiB  48.75  1.63   27      up  osd.16
23  ssd     1.74599   1.00000  1.7 TiB  438 GiB  436 GiB  1.5 MiB  1.7 GiB  1.3 TiB  24.48  0.82   14      up  osd.23
24  ssd     1.74599   1.00000  1.7 TiB  444 GiB  443 GiB  1.6 MiB  1.0 GiB  1.3 TiB  24.81  0.83   17      up  osd.24
-6          9.47400         -  9.5 TiB  2.9 TiB  2.9 TiB   46 MiB  8.1 GiB  6.6 TiB  30.39  1.02    -          host px2-ssd
12  ssd     0.82999   1.00000  850 GiB  154 GiB  154 GiB   21 MiB  368 MiB  696 GiB  18.16  0.61    9      up  osd.12
13  ssd     0.82999   1.00000  850 GiB  144 GiB  143 GiB  527 KiB  469 MiB  706 GiB  16.92  0.57    4      up  osd.13
14  ssd     0.82999   1.00000  850 GiB  149 GiB  149 GiB   16 MiB  299 MiB  700 GiB  17.58  0.59    7      up  osd.14
29  ssd     1.74599   1.00000  1.7 TiB  449 GiB  448 GiB  1.6 MiB  1.4 GiB  1.3 TiB  25.11  0.84   20      up  osd.29
30  ssd     1.74599   0.85004  1.7 TiB  885 GiB  882 GiB  3.1 MiB  2.3 GiB  903 GiB  49.48  1.65   31      up  osd.30
31  ssd     1.74599   1.00000  1.7 TiB  728 GiB  727 GiB  2.6 MiB  1.8 GiB  1.0 TiB  40.74  1.36   22      up  osd.31
32  ssd     1.74599   1.00000  1.7 TiB  438 GiB  437 GiB  1.6 MiB  1.4 GiB  1.3 TiB  24.51  0.82   15      up  osd.32
-4          9.47400         -  9.5 TiB  3.0 TiB  3.0 TiB   49 MiB  8.7 GiB  6.5 TiB  31.78  1.06    -          host px3-ssd
19  ssd     0.82999   1.00000  850 GiB  293 GiB  292 GiB   14 MiB  500 MiB  557 GiB  34.47  1.15    9      up  osd.19
20  ssd     0.82999   1.00000  850 GiB  290 GiB  290 GiB   10 MiB  482 MiB  560 GiB  34.15  1.14   10      up  osd.20
21  ssd     0.82999   1.00000  850 GiB  148 GiB  147 GiB   16 MiB  428 MiB  702 GiB  17.36  0.58    5      up  osd.21
25  ssd     1.74599   1.00000  1.7 TiB  446 GiB  445 GiB  1.8 MiB  1.6 GiB  1.3 TiB  24.96  0.83   19      up  osd.25
26  ssd     1.74599   1.00000  1.7 TiB  739 GiB  737 GiB  2.6 MiB  2.0 GiB  1.0 TiB  41.33  1.38   29      up  osd.26
27  ssd     1.74599   1.00000  1.7 TiB  725 GiB  723 GiB  2.6 MiB  2.1 GiB  1.0 TiB  40.55  1.36   21      up  osd.27
28  ssd     1.74599   1.00000  1.7 TiB  442 GiB  440 GiB  1.6 MiB  1.7 GiB  1.3 TiB  24.72  0.83   17      up  osd.28
I have done a "ceph osd reweight-by-utilization" and "ceph osd
set-require-min-compat-client luminous". The pool has 32 PGs which
were set by autoscale_mode, which is on.
Why are my OSDs, so unbalanced? I have osd.5 with 68.3% and osd.6
with 0.97%.... Also when the reweight-by-utilization, osd.5
utilization actually increased...
What am i missing here?
Sp
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx