Re: Trying to understand what overlapped roots means in pg_autoscale's scale-down mode

I asked as well; so far it seems nobody on the list knows.


On 9/30/21 10:34 AM, Andrew Gunnerson wrote:
Hello,

I'm trying to figure out what "overlapping roots" means with the default
scale-down autoscaling profile in Ceph Pacific. My test setup involves a CRUSH
map that looks like this:

     ID=-1  | root=default
     ID=-58 |     rack=rack1
     ID=-70 |         host=ssd-1
            |             <OSDs>
     ID=-61 |     rack=rack2
     ID=-55 |         host=ssd-2
            |             <OSDs>
     ID=-62 |     rack=rack3
     ID=-52 |         host=ssd-3
            |             <OSDs>
     ID=-63 |     rack=rack4
     ID=-19 |         host=hdd-1
            |             <OSDs>
            |         <15 more hosts>
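
For reference, the shadow roots that show up further down, like default~ssd and
default~hdd, can be listed by including shadow buckets in the CRUSH tree dump,
e.g.:

     # Show the per-device-class shadow hierarchy (default~ssd, default~hdd, ...)
     ceph osd crush tree --show-shadow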

The CRUSH rules I created are:

     # Rack failure domain for SSDs
     ceph osd crush rule create-replicated replicated_ssd default rack ssd
     # Host failure domain for HDDs
     ceph osd crush rule create-replicated replicated_hdd default host hdd
     # EC profile for HDDs (k=3, m=2), host failure domain
     ceph osd erasure-code-profile set erasure_hdd ruleset k=3 m=2 crush-device-class=hdd crush-failure-domain=host
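
The shadow root each rule ends up taking can be double-checked by dumping the
rule and looking at the "take" step's item_name field, e.g.:

     # The first "take" step reports the (shadow) root the rule starts from,
     # e.g. "item_name": "default~ssd" for the ssd-class replicated rule.
     ceph osd crush rule dump replicated_ssd
     ceph osd crush rule dump replicated_hdd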

The pools are:

     Pool                       | CRUSH rule/profile | Overlapped roots error
     ---------------------------|--------------------|-----------------------
     device_health_metrics      | replicated_rule    | -1 (root=default)
     cephfs_metadata            | replicated_ssd     | -51 (root=default~ssd)
     cephfs_data_replicated_ssd | replicated_ssd     | -51 (root=default~ssd)
     cephfs_data_replicated_hdd | replicated_hdd     | -2 (root=default~hdd)
     cephfs_data_erasure_hdd    | erasure_hdd        | -1 (root=default)
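
The pool-to-rule mapping and the autoscaler's view of the pools can be verified
with, for example:

     # Which CRUSH rule a given pool is using
     ceph osd pool get cephfs_data_replicated_ssd crush_rule
     # Per-pool autoscaler status (PG targets, autoscale mode, ...)
     ceph osd pool autoscale-status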

With this setup, the autoscaler skips every pool and logs the following
warning:

     [pg_autoscaler WARNING root] pool <num> contains an overlapping root -<id>... skipping scaling
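
That warning comes from the active mgr; on a systemd host it can be followed
with something like the line below (where "mgr-host-1" is just a placeholder
for the node running the active ceph-mgr):

     # Placeholder hostname; substitute the active mgr's daemon name
     journalctl -u ceph-mgr@mgr-host-1 -f | grep pg_autoscaler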

There doesn't seem to be much documentation about overlapped roots, and I think
I'm fundamentally misunderstanding what the term means. Does it mean that the
autoscaler can't handle two different pools using OSDs under the same (shadow?)
root in the CRUSH map?

Is this setup simply not possible with the scale-down autoscaler profile? The
scale-up profile doesn't seem to have a concept of overlapped roots.
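
In the meantime I could presumably fall back to the scale-up behaviour; if I'm
reading the Pacific docs right, that would be something like the command below,
but I'd rather understand what scale-down is complaining about:

     # Switch the autoscaler back to the pre-Pacific scale-up behaviour
     # (assuming the Pacific autoscale-profile setting is available here)
     ceph osd pool set autoscale-profile scale-up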

Thank you,
Andrew
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx