Re: [ceph-users] Bug with autoscale-status in 17.2.0 ?

Hi Jake,

This looks similar to something I ran into recently with version 16.2.7.

In my case, the active manager's log showed a message about (non-)overlapping roots, so I adjusted the crush rules. For me the fix was quite simple, because the cluster was all SSDs,
but the default replicated_rule never specified the ssd device class.
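
If you want to check for the same symptom, the message shows up in the active mgr's log. The exact path and unit name depend on how your daemons are deployed, so take these as examples only:

  # packaged install, default log location
  grep -i "overlapping roots" /var/log/ceph/ceph-mgr.*.log

  # or via systemd, with your mgr instance name filled in
  journalctl -u ceph-mgr@<instance> | grep -i "overlapping roots"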

What I did was set the correct device class in the crush rule.
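
Roughly, that boils down to creating a rule that pins the device class and pointing the pools at it. The rule name below is only an example, adjust it to your setup, and keep in mind that switching the rule on an existing pool will move data:

  ceph osd crush rule create-replicated replicated_ssd default host ssd
  ceph osd pool set <pool> crush_rule replicated_ssd

Once every pool in use mapped to the same (shadow) root, the autoscaler stopped complaining here.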

A little background on that:
  the default root usually has the id -1,
  but when a crush rule selects device class ssd, CRUSH uses the shadow root default~ssd,
  and this shadow root has a different id (in my case it was -2).
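
You can list those shadow roots and their ids with:

  ceph osd crush tree --show-shadow

It should show entries like default~ssd (and default~hdd) with their own negative bucket ids next to the plain default root.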

Possibly your pool `.mgr` uses a crush rule that doesn't specify a device class, but maps to OSDs that are also covered by another crush rule in use which does specify one; that gives you the overlapping roots.
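
To verify, something along these lines should tell you which rule .mgr uses and whether that rule restricts itself to a device class (in the dump, look at the "take" step: a plain "default" item vs. "default~ssd"):

  ceph osd pool get .mgr crush_rule
  ceph osd crush rule dump <rule name from the previous command>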


Best regards

On 6/10/22 1:14 PM, Jake Grimmett wrote:
Dear All,

We are testing Quincy on a new large cluster. "ceph osd pool autoscale-status" fails if we add a pool that uses a custom crush rule with a specific device class, but it's fine if we don't specify the class:

[root@wilma-s1 ~]# ceph -v
ceph version 17.2.0 (43e2e60a7559d3f46c9d53f1ca875fd499a1e35e) quincy (stable)

[root@wilma-s1 ~]# cat /etc/redhat-release
AlmaLinux release 8.6 (Sky Tiger)

[root@wilma-s1 ~]# ceph osd pool autoscale-status
POOL       SIZE  TARGET SIZE  RATE  RAW CAPACITY   RATIO  TARGET RATIO  EFFECTIVE RATIO  BIAS  PG_NUM  NEW PG_NUM  AUTOSCALE  BULK
.mgr      6980k                3.0         7200T  0.0000                                  1.0       1              on         False
ec82pool      0               1.25         7200T  0.0000                                  1.0    1024          32  off        False


[root@wilma-s1 ~]# ceph osd crush rule create-replicated ssd_replicated default host ssd
[root@wilma-s1 ~]# ceph osd pool create mds_ssd 32 32 ssd_replicated
pool 'mds_ssd' created
[root@wilma-s1 ~]# ceph df
--- RAW STORAGE ---
CLASS     SIZE    AVAIL     USED  RAW USED  %RAW USED
hdd    7.0 PiB  6.9 PiB  126 TiB   126 TiB       1.75
ssd    2.7 TiB  2.7 TiB  3.1 GiB   3.1 GiB       0.11
TOTAL  7.0 PiB  6.9 PiB  126 TiB   126 TiB       1.75

--- POOLS ---
POOL      ID   PGS   STORED  OBJECTS    USED  %USED  MAX AVAIL
.mgr       4     1  6.8 MiB        3  20 MiB      0    2.2 PiB
ec82pool   8  1024      0 B        0     0 B      0    5.2 PiB
mds_ssd   13    32      0 B        0     0 B      0    884 GiB

[root@wilma-s1 ~]# ceph osd pool autoscale-status
(exits with no output)
[root@wilma-s1 ~]# ceph osd pool delete mds_ssd mds_ssd --yes-i-really-really-mean-it
pool 'mds_ssd' removed
[root@wilma-s1 ~]# ceph osd pool autoscale-status
POOL       SIZE  TARGET SIZE  RATE  RAW CAPACITY   RATIO  TARGET RATIO  EFFECTIVE RATIO  BIAS  PG_NUM  NEW PG_NUM  AUTOSCALE  BULK
.mgr      6980k                3.0         7200T  0.0000                                  1.0       1              on         False
ec82pool      0               1.25         7200T  0.0000                                  1.0    1024          32  off        False

Any ideas on what might be going on?

We get a similar problem if we specify hdd as the class.

best regards

Jake




