Hi,
I've upgraded our test cluster to Octopus and enabled the PG auto-scaler.
The adjustment it kicked off is nearly finished:
PG autoscaler decreasing pool 11 PGs from 1024 to 32 (4d)
[==========================..] (remaining: 3h)
But I notice it appears to be shrinking pool 11 even though HEALTH_WARN
suggests the pool should be larger:
root@sto-t1-1:~# ceph health detail
HEALTH_WARN 1 pools have many more objects per pg than average; 9 pgs not deep-scrubbed in time
[WRN] MANY_OBJECTS_PER_PG: 1 pools have many more objects per pg than average
    pool default.rgw.buckets.data objects per pg (313153) is more than 23.4063 times cluster average (13379)
...which seems like the wrong thing for the auto-scaler to be doing. Is
this a known problem?
Regards,
Matthew
More details:
ceph df:
root@sto-t1-1:~# ceph df
--- RAW STORAGE ---
CLASS  SIZE     AVAIL    USED     RAW USED  %RAW USED
hdd    993 TiB  782 TiB  210 TiB  211 TiB   21.22
TOTAL  993 TiB  782 TiB  210 TiB  211 TiB   21.22
--- POOLS ---
POOL                        ID  STORED   OBJECTS  USED     %USED  MAX AVAIL
.rgw.root                   2   69 KiB   4        1.4 MiB  0      220 TiB
default.rgw.control         3   1.1 MiB  8        3.3 MiB  0      220 TiB
default.rgw.data.root       4   115 KiB  14       3.6 MiB  0      220 TiB
default.rgw.gc              5   5.3 MiB  32       23 MiB   0      220 TiB
default.rgw.log             6   31 MiB   184      96 MiB   0      220 TiB
default.rgw.users.uid       7   249 KiB  8        1.8 MiB  0      220 TiB
default.rgw.buckets.data    11  23 GiB   10.02M   2.0 TiB  0.30   220 TiB
rgwtls                      13  54 KiB   3        843 KiB  0      220 TiB
pilot-metrics               14  285 MiB  2.60M    476 GiB  0.07   220 TiB
pilot-images                15  40 GiB   4.97k    122 GiB  0.02   220 TiB
pilot-volumes               16  192 GiB  48.90k   577 GiB  0.09   220 TiB
pilot-vms                   17  125 GiB  33.79k   376 GiB  0.06   220 TiB
default.rgw.users.keys      18  111 KiB  5        1.5 MiB  0      220 TiB
default.rgw.buckets.index   19  4.0 GiB  246      12 GiB   0      220 TiB
rbd                         20  39 TiB   10.09M   116 TiB  14.88  220 TiB
default.rgw.buckets.non-ec  21  344 KiB  1        1.0 MiB  0      220 TiB
rgw-ec                      22  7.0 TiB  1.93M    11 TiB   1.57   441 TiB
rbd-ec                      23  45 TiB   11.73M   67 TiB   9.22   441 TiB
default.rgw.users.email     24  23 MiB   1        69 MiB   0      220 TiB
pilot-backups               25  73 MiB   3        219 MiB  0      220 TiB
device_health_metrics       26  51 MiB   186      153 MiB  0      220 TiB
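
For what it's worth, the skew in that warning is consistent with the ceph df
numbers above, given the pool is now at 32 PGs. Here's my quick back-of-the-envelope
check in Python (a sketch only; I'm assuming the threshold involved is
mon_pg_warn_max_object_skew, which I believe defaults to 10):

# Rough check of the MANY_OBJECTS_PER_PG figures using the ceph df output above.
# Assumes the threshold compared against is mon_pg_warn_max_object_skew (default 10).
pool_objects = 10_020_000              # default.rgw.buckets.data object count (rounded)
pool_pg_num = 32                       # pg_num the autoscaler is shrinking the pool to
cluster_avg_objects_per_pg = 13_379    # cluster average quoted in the warning

objects_per_pg = pool_objects / pool_pg_num               # ~313,000
skew = objects_per_pg / cluster_avg_objects_per_pg        # ~23.4
print(f"{objects_per_pg:.0f} objects/pg, {skew:.1f}x the cluster average")

That lines up (modulo my rounded object count) with the 313153 and 23.4063 in the
warning, so the health check already appears to be using the reduced pg_num.
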
root@sto-t1-1:~# ceph osd pool autoscale-status
POOL                        SIZE    TARGET SIZE  RATE  RAW CAPACITY  RATIO   TARGET RATIO  EFFECTIVE RATIO  BIAS  PG_NUM  NEW PG_NUM  AUTOSCALE
.rgw.root                   70843                3.0   992.7T        0.0000                                 1.0   32                  on
default.rgw.control         1116k                3.0   992.7T        0.0000                                 1.0   32                  on
default.rgw.data.root       115.1k               3.0   992.7T        0.0000                                 1.0   32                  on
default.rgw.gc              5379k                3.0   992.7T        0.0000                                 1.0   32                  on
default.rgw.log             32036k               3.0   992.7T        0.0000                                 1.0   32                  on
default.rgw.users.uid       248.7k               3.0   992.7T        0.0000                                 1.0   32                  on
default.rgw.buckets.data    23894M               3.0   992.7T        0.0001                                 1.0   32                  on
rgwtls                      55760                3.0   992.7T        0.0000                                 1.0   32                  on
pilot-metrics               285.3M               3.0   992.7T        0.0000                                 1.0   32                  on
pilot-images                41471M               3.0   992.7T        0.0001                                 1.0   32                  on
pilot-volumes               192.3G               3.0   992.7T        0.0006                                 1.0   32                  on
pilot-vms                   124.6G               3.0   992.7T        0.0004                                 1.0   32                  on
default.rgw.users.keys      111.1k               3.0   992.7T        0.0000                                 1.0   32                  on
default.rgw.buckets.index   4090M                3.0   992.7T        0.0000                                 1.0   32                  on
rbd                         39430G               3.0   992.7T        0.1164                                 1.0   1024                on
default.rgw.buckets.non-ec  344.3k               3.0   992.7T        0.0000                                 1.0   32                  on
rgw-ec                      7175G                1.5   992.7T        0.0106                                 1.0   64                  on
rbd-ec                      45806G               1.5   992.7T        0.0676                                 1.0   1024                on
default.rgw.users.email     23530k               3.0   992.7T        0.0000                                 1.0   32                  on
pilot-backups               74699k               3.0   992.7T        0.0000                                 1.0   32                  on
device_health_metrics       52128k               3.0   992.7T        0.0000                                 1.0   32                  on
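
And for completeness, here's a sketch of how I understand the autoscaler arrives
at 32 for default.rgw.buckets.data. This is only my reading of the documented
behaviour, not the actual mgr module code; the OSD count, the pg_num floor of 32
and all the names below are illustrative assumptions (mon_target_pg_per_osd
defaults to 100, I believe):

# Sketch of (my understanding of) the pg_autoscaler sizing logic; not the real code.
# osd_count and pg_num_floor are assumptions for illustration.
def nearest_power_of_two(x):
    # Round a positive value to the nearest power of two, as the autoscaler does.
    if x < 1:
        return 1
    lower = 1 << (int(x).bit_length() - 1)
    return lower if (x - lower) < (lower * 2 - x) else lower * 2

def suggested_pg_num(capacity_ratio, replica_size, bias, osd_count,
                     target_pg_per_osd=100, pg_num_floor=32):
    # capacity_ratio is the pool's share of raw capacity, replication included
    # (the RATIO column above), so divide the per-OSD PG-replica budget back
    # out by the pool's size to get a pg_num for the pool itself.
    pg_replica_budget = osd_count * target_pg_per_osd
    raw_target = capacity_ratio * pg_replica_budget / replica_size * bias
    return max(pg_num_floor, nearest_power_of_two(raw_target))

# default.rgw.buckets.data: ratio 0.0001, size 3, bias 1.0. With any plausible
# OSD count the raw target is a handful of PGs, so it lands on the floor of 32,
# hence the proposed shrink from 1024.
print(suggested_pg_num(0.0001, 3, 1.0, osd_count=260))   # -> 32

If that's roughly right, the sizing is driven purely by stored bytes against raw
capacity (plus any target_size/target_ratio hints, which we haven't set) and never
by object counts, so it will quite happily pick a pg_num that trips
MANY_OBJECTS_PER_PG. I also believe it only applies a change once the suggestion
differs from the current pg_num by more than a factor of three, which would be why
none of the other pools show a NEW PG_NUM.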