Hi,
I've upgraded our test cluster to Octopus and enabled the PG auto-scaler.
The adjustment it kicked off is nearly finished:
PG autoscaler decreasing pool 11 PGs from 1024 to 32 (4d)
[==========================..] (remaining: 3h)
But I notice it appears to be shrinking pool 11 even though HEALTH_WARN
suggests the pool should be larger:
root@sto-t1-1:~# ceph health detail
HEALTH_WARN 1 pools have many more objects per pg than average; 9 pgs not deep-scrubbed in time
[WRN] MANY_OBJECTS_PER_PG: 1 pools have many more objects per pg than average
    pool default.rgw.buckets.data objects per pg (313153) is more than 23.4063 times cluster average (13379)
...which seems like the wrong thing for the auto-scaler to be doing. Is
this a known problem?
Regards,
Matthew
More details:
ceph df:
root@sto-t1-1:~# ceph df
--- RAW STORAGE ---
CLASS  SIZE     AVAIL    USED     RAW USED  %RAW USED
hdd    993 TiB  782 TiB  210 TiB  211 TiB   21.22
TOTAL  993 TiB  782 TiB  210 TiB  211 TiB   21.22
--- POOLS ---
POOL                        ID  STORED   OBJECTS  USED     %USED  MAX AVAIL
.rgw.root                   2   69 KiB   4        1.4 MiB  0      220 TiB
default.rgw.control         3   1.1 MiB  8        3.3 MiB  0      220 TiB
default.rgw.data.root       4   115 KiB  14       3.6 MiB  0      220 TiB
default.rgw.gc              5   5.3 MiB  32       23 MiB   0      220 TiB
default.rgw.log             6   31 MiB   184      96 MiB   0      220 TiB
default.rgw.users.uid       7   249 KiB  8        1.8 MiB  0      220 TiB
default.rgw.buckets.data    11  23 GiB   10.02M   2.0 TiB  0.30   220 TiB
rgwtls                      13  54 KiB   3        843 KiB  0      220 TiB
pilot-metrics               14  285 MiB  2.60M    476 GiB  0.07   220 TiB
pilot-images                15  40 GiB   4.97k    122 GiB  0.02   220 TiB
pilot-volumes               16  192 GiB  48.90k   577 GiB  0.09   220 TiB
pilot-vms                   17  125 GiB  33.79k   376 GiB  0.06   220 TiB
default.rgw.users.keys      18  111 KiB  5        1.5 MiB  0      220 TiB
default.rgw.buckets.index   19  4.0 GiB  246      12 GiB   0      220 TiB
rbd                         20  39 TiB   10.09M   116 TiB  14.88  220 TiB
default.rgw.buckets.non-ec  21  344 KiB  1        1.0 MiB  0      220 TiB
rgw-ec                      22  7.0 TiB  1.93M    11 TiB   1.57   441 TiB
rbd-ec                      23  45 TiB   11.73M   67 TiB   9.22   441 TiB
default.rgw.users.email     24  23 MiB   1        69 MiB   0      220 TiB
pilot-backups               25  73 MiB   3        219 MiB  0      220 TiB
device_health_metrics       26  51 MiB   186      153 MiB  0      220 TiB
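
For what it's worth, the skew in that warning is consistent with the ceph df
numbers above, given the pool is now at 32 PGs. Here's my quick back-of-the-envelope
check in Python (a sketch only; I'm assuming the threshold involved is
mon_pg_warn_max_object_skew, which I believe defaults to 10):

# Rough check of the MANY_OBJECTS_PER_PG figures using the ceph df output above.
# Assumes the threshold compared against is mon_pg_warn_max_object_skew (default 10).
pool_objects = 10_020_000              # default.rgw.buckets.data object count (rounded)
pool_pg_num = 32                       # pg_num the autoscaler is shrinking the pool to
cluster_avg_objects_per_pg = 13_379    # cluster average quoted in the warning

objects_per_pg = pool_objects / pool_pg_num               # ~313,000
skew = objects_per_pg / cluster_avg_objects_per_pg        # ~23.4
print(f"{objects_per_pg:.0f} objects/pg, {skew:.1f}x the cluster average")

That lines up (modulo my rounded object count) with the 313153 and 23.4063 in the
warning, so the health check already appears to be using the reduced pg_num.
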
root@sto-t1-1:~# ceph osd pool autoscale-status
POOL                        SIZE    TARGET SIZE  RATE  RAW CAPACITY  RATIO   TARGET RATIO  EFFECTIVE RATIO  BIAS  PG_NUM  NEW PG_NUM  AUTOSCALE
.rgw.root                   70843                3.0   992.7T        0.0000                                 1.0   32                  on
default.rgw.control         1116k                3.0   992.7T        0.0000                                 1.0   32                  on
default.rgw.data.root       115.1k               3.0   992.7T        0.0000                                 1.0   32                  on
default.rgw.gc              5379k                3.0   992.7T        0.0000                                 1.0   32                  on
default.rgw.log             32036k               3.0   992.7T        0.0000                                 1.0   32                  on
default.rgw.users.uid       248.7k               3.0   992.7T        0.0000                                 1.0   32                  on
default.rgw.buckets.data    23894M               3.0   992.7T        0.0001                                 1.0   32                  on
rgwtls                      55760                3.0   992.7T        0.0000                                 1.0   32                  on
pilot-metrics               285.3M               3.0   992.7T        0.0000                                 1.0   32                  on
pilot-images                41471M               3.0   992.7T        0.0001                                 1.0   32                  on
pilot-volumes               192.3G               3.0   992.7T        0.0006                                 1.0   32                  on
pilot-vms                   124.6G               3.0   992.7T        0.0004                                 1.0   32                  on
default.rgw.users.keys      111.1k               3.0   992.7T        0.0000                                 1.0   32                  on
default.rgw.buckets.index   4090M                3.0   992.7T        0.0000                                 1.0   32                  on
rbd                         39430G               3.0   992.7T        0.1164                                 1.0   1024                on
default.rgw.buckets.non-ec  344.3k               3.0   992.7T        0.0000                                 1.0   32                  on
rgw-ec                      7175G                1.5   992.7T        0.0106                                 1.0   64                  on
rbd-ec                      45806G               1.5   992.7T        0.0676                                 1.0   1024                on
default.rgw.users.email     23530k               3.0   992.7T        0.0000                                 1.0   32                  on
pilot-backups               74699k               3.0   992.7T        0.0000                                 1.0   32                  on
device_health_metrics       52128k               3.0   992.7T        0.0000                                 1.0   32                  on
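
And for completeness, here's a sketch of how I understand the autoscaler arrives
at 32 for default.rgw.buckets.data. This is only my reading of the documented
behaviour, not the actual mgr module code; the OSD count, the pg_num floor of 32
and all the names below are illustrative assumptions (mon_target_pg_per_osd
defaults to 100, I believe):

# Sketch of (my understanding of) the pg_autoscaler sizing logic; not the real code.
# osd_count and pg_num_floor are assumptions for illustration.
def nearest_power_of_two(x):
    # Round a positive value to the nearest power of two, as the autoscaler does.
    if x < 1:
        return 1
    lower = 1 << (int(x).bit_length() - 1)
    return lower if (x - lower) < (lower * 2 - x) else lower * 2

def suggested_pg_num(capacity_ratio, replica_size, bias, osd_count,
                     target_pg_per_osd=100, pg_num_floor=32):
    # capacity_ratio is the pool's share of raw capacity, replication included
    # (the RATIO column above), so divide the per-OSD PG-replica budget back
    # out by the pool's size to get a pg_num for the pool itself.
    pg_replica_budget = osd_count * target_pg_per_osd
    raw_target = capacity_ratio * pg_replica_budget / replica_size * bias
    return max(pg_num_floor, nearest_power_of_two(raw_target))

# default.rgw.buckets.data: ratio 0.0001, size 3, bias 1.0. With any plausible
# OSD count the raw target is a handful of PGs, so it lands on the floor of 32,
# hence the proposed shrink from 1024.
print(suggested_pg_num(0.0001, 3, 1.0, osd_count=260))   # -> 32

If that's roughly right, the sizing is driven purely by stored bytes against raw
capacity (plus any target_size/target_ratio hints, which we haven't set) and never
by object counts, so it will quite happily pick a pg_num that trips
MANY_OBJECTS_PER_PG. I also believe it only applies a change once the suggestion
differs from the current pg_num by more than a factor of three, which would be why
none of the other pools show a NEW PG_NUM.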