Hi Jake,
This looks similar to something I ran into recently with version 16.2.7.
In my case, the active manager's log showed something about (non-)overlapping
roots, so I adjusted the crush rules. For me it was quite simple because the
cluster was all SSDs, but the default replicated_rule had never been set to
class ssd. What I did was set the correct device class.
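Roughly like this (just a sketch - the rule name replicated_ssd and <pool>
are placeholders, adjust them to your setup):

  # create a replicated rule restricted to the ssd device class
  ceph osd crush rule create-replicated replicated_ssd default host ssd
  # switch the affected pool over to that rule
  ceph osd pool set <pool> crush_rule replicated_ssd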
A little background on that: the default root usually has the id -1, but when
a crush rule chooses device class ssd, CRUSH uses the shadow root default~ssd,
which has a different id. In my case it was -2.
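You can list those shadow roots and their ids with (if I remember the flag
correctly):

  # shows the per-device-class shadow hierarchy, e.g. default~ssd
  ceph osd crush tree --show-shadow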
Possibly your pool `.mgr` uses a crush rule that doesn't specify a device
class but includes OSDs that are also covered by another crush rule in use,
so the autoscaler sees overlapping roots.
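Something like this should show which rule .mgr uses and whether that rule's
"take" step references the plain default root or a class-specific one (the
rule name below is only a guess, use whatever the first command returns):

  # which crush rule does the .mgr pool use?
  ceph osd pool get .mgr crush_rule
  # dump that rule and look at the "item_name" in its "take" step
  ceph osd crush rule dump replicated_rule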
Best regards
On 6/10/22 1:14 PM, Jake Grimmett wrote:
Dear All,
We are testing Quincy on a new large cluster. "ceph osd pool
autoscale-status" fails if we add a pool that uses a custom crush rule with a
specific device class, but it's fine if we don't specify the class:
[root@wilma-s1 ~]# ceph -v
ceph version 17.2.0 (43e2e60a7559d3f46c9d53f1ca875fd499a1e35e) quincy (stable)
[root@wilma-s1 ~]# cat /etc/redhat-release
AlmaLinux release 8.6 (Sky Tiger)
[root@wilma-s1 ~]# ceph osd pool autoscale-status
POOL       SIZE   TARGET SIZE  RATE  RAW CAPACITY  RATIO   TARGET RATIO  EFFECTIVE RATIO  BIAS  PG_NUM  NEW PG_NUM  AUTOSCALE  BULK
.mgr       6980k               3.0   7200T         0.0000                                 1.0        1              on         False
ec82pool   0                   1.25  7200T         0.0000                                 1.0     1024          32  off        False
[root@wilma-s1 ~]# ceph osd crush rule create-replicated ssd_replicated default host ssd
[root@wilma-s1 ~]# ceph osd pool create mds_ssd 32 32 ssd_replicated
pool 'mds_ssd' created
[root@wilma-s1 ~]# ceph df
--- RAW STORAGE ---
CLASS SIZE AVAIL USED RAW USED %RAW USED
hdd 7.0 PiB 6.9 PiB 126 TiB 126 TiB 1.75
ssd 2.7 TiB 2.7 TiB 3.1 GiB 3.1 GiB 0.11
TOTAL 7.0 PiB 6.9 PiB 126 TiB 126 TiB 1.75
--- POOLS ---
POOL ID PGS STORED OBJECTS USED %USED MAX AVAIL
.mgr 4 1 6.8 MiB 3 20 MiB 0 2.2 PiB
ec82pool 8 1024 0 B 0 0 B 0 5.2 PiB
mds_ssd 13 32 0 B 0 0 B 0 884 GiB
[root@wilma-s1 ~]# ceph osd pool autoscale-status
(exits with no output)
[root@wilma-s1 ~]# ceph osd pool delete mds_ssd mds_ssd --yes-i-really-really-mean-it
pool 'mds_ssd' removed
[root@wilma-s1 ~]# ceph osd pool autoscale-status
POOL       SIZE   TARGET SIZE  RATE  RAW CAPACITY  RATIO   TARGET RATIO  EFFECTIVE RATIO  BIAS  PG_NUM  NEW PG_NUM  AUTOSCALE  BULK
.mgr       6980k               3.0   7200T         0.0000                                 1.0        1              on         False
ec82pool   0                   1.25  7200T         0.0000                                 1.0     1024          32  off        False
Any ideas on what might be going on?
We get a similar problem if we specify hdd as the class.
Best regards
Jake