Re: Trying to understand what overlapped roots means in pg_autoscale's scale-down mode


 



On Fri, Oct 1, 2021 at 11:55 AM Andrew Gunnerson
<accounts.ceph@xxxxxxxxxxxx> wrote:
>
> Thanks for the info. Do shadow roots affect that calculation at all?

They're not supposed to, but I haven't worked in this code much. The
different names you get (with the ~ssd and ~hdd postfix) would indicate
that they don't.
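
Roughly speaking, the autoscaler looks at which root each pool's CRUSH rule
takes from and flags pools whose roots contain some of the same OSDs. This
isn't the actual mgr module code, just a sketch of the idea (the OSD sets and
pool names below are made up). The key point is that the plain "default" root
contains every OSD regardless of class, so a rule with no device class shares
OSDs with both the ~ssd and ~hdd shadow roots:

    # Sketch only (not the pg_autoscaler code); OSD sets below are made up.
    osds_under_root = {
        "default":     {0, 1, 2, 3, 4, 5},  # plain root: every OSD, any class
        "default~ssd": {0, 1, 2},           # ssd shadow tree
        "default~hdd": {3, 4, 5},           # hdd shadow tree
    }

    pool_root = {
        "device_health_metrics":      "default",      # replicated_rule, no class
        "cephfs_metadata":            "default~ssd",
        "cephfs_data_replicated_hdd": "default~hdd",
    }

    def overlapping_roots():
        """Pairs of pools whose CRUSH roots share at least one OSD."""
        pools = list(pool_root)
        pairs = []
        for i, a in enumerate(pools):
            for b in pools[i + 1:]:
                ra, rb = pool_root[a], pool_root[b]
                if ra != rb and osds_under_root[ra] & osds_under_root[rb]:
                    pairs.append((a, b))
        return pairs

    # "default" shares OSDs with both shadow roots, so device_health_metrics
    # overlaps with every pool pinned to a device class.
    print(overlapping_roots())

If that's what is going on here, having every rule specify a device class (so
no pool's rule takes the plain root) would make the roots disjoint.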

>
> In the regular "ceph osd crush tree", I don't see any cycles and it doesn't seem
> like there are the same buckets under two roots (I only have one root):
>
>     root=default
>         rack=rack1
>             host=ssd-1
>                 <OSDs>
>         rack=rack2
>             host=ssd-2
>                 <OSDs>
>         rack=rack3
>             host=ssd-3
>                 <OSDs>
>         rack=rack4
>             host=hdd-1
>                 <OSDs>
>
> If I use `--show-shadow` and pick `host=ssd-1` as an example, I see:
>
>     root=default~ssd
>         rack=rack1~ssd
>             host=ssd-1~ssd
>                 <OSDs>
>     root=default~hdd
>         rack=rack1~hdd
>             host=ssd-1~hdd
>                 <Nothing/no OSDs>
>
> Are (eg.) `host=ssd-1~ssd` and `host=ssd-1~hdd` treated as the same bucket?
>
> On Fri, Oct 1, 2021, at 14:27, Gregory Farnum wrote:
> > It generally means, in CS terms, that you have a graph rather than a tree.
> >
> > In other words, you have two roots, or other crush buckets, which
> > contain some of the same buckets/items underneath themselves.
> >
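As a toy illustration (made-up bucket names, nothing Ceph-specific), "two
roots containing the same item" looks like this, and the shared subtree is
exactly the overlap:

    # Toy example: if the same bucket is reachable from two roots, the
    # hierarchy is a DAG, not a set of disjoint trees.
    children = {
        "root-a": ["rack1", "rack2"],
        "root-b": ["rack2", "rack3"],   # rack2 sits under both roots
        "rack1": ["host1"],
        "rack2": ["host2"],
        "rack3": ["host3"],
    }

    def descendants(node):
        """Every bucket/item reachable from `node`."""
        out = set()
        for child in children.get(node, []):
            out.add(child)
            out |= descendants(child)
        return out

    # Anything the two subtrees share is "overlap".
    print(descendants("root-a") & descendants("root-b"))   # {'rack2', 'host2'}
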
> > On Fri, Oct 1, 2021 at 9:43 AM Harry G. Coin <hgcoin@xxxxxxxxx> wrote:
> >>
> >> I asked as well, it seems nobody on the list knows so far.
> >>
> >>
> >> On 9/30/21 10:34 AM, Andrew Gunnerson wrote:
> >> > Hello,
> >> >
> >> > I'm trying to figure out what overlapping roots entails with the default
> >> > scale-down autoscaling profile in Ceph Pacific. My test setup involves a CRUSH
> >> > map that looks like:
> >> >
> >> >      ID=-1  | root=default
> >> >      ID=-58 |     rack=rack1
> >> >      ID=-70 |         host=ssd-1
> >> >             |             <OSDs>
> >> >      ID=-61 |     rack=rack2
> >> >      ID=-55 |         host=ssd-2
> >> >             |             <OSDs>
> >> >      ID=-62 |     rack=rack3
> >> >      ID=-52 |         host=ssd-3
> >> >             |             <OSDs>
> >> >      ID=-63 |     rack=rack4
> >> >      ID=-19 |         host=hdd-1
> >> >             |             <OSDs>
> >> >             |         <15 more hosts>
> >> >
> >> > The CRUSH rules I created are:
> >> >
> >> >      # Rack failure domain for SSDs
> >> >      ceph osd crush rule create-replicated replicated_ssd default rack ssd
> >> >      # Host failure domain for HDDs
> >> >      ceph osd crush rule create-replicated replicated_hdd default host hdd
> >> >      ceph osd erasure-code-profile set erasure_hdd ruleset k=3 m=2 crush-device-class=hdd crush-failure-domain=host
> >> >
> >> > and the pools are:
> >> >
> >> >      Pool                       | CRUSH rule/profile | Overlapped roots error
> >> >      ---------------------------|--------------------|-----------------------
> >> >      device_health_metrics      | replicated_rule    | -1 (root=default)
> >> >      cephfs_metadata            | replicated_ssd     | -51 (root=default~ssd)
> >> >      cephfs_data_replicated_ssd | replicated_ssd     | -51 (root=default~ssd)
> >> >      cephfs_data_replicated_hdd | replicated_hdd     | -2 (root=default~hdd)
> >> >      cephfs_data_erasure_hdd    | erasure_hdd        | -1 (root=default)
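
(Side note: the root each rule takes from, including the shadow-root IDs
shown in that last column, can be read back out of `ceph osd crush rule
dump`. A minimal sketch, assuming the usual JSON layout with a "steps" list
containing "take" ops:)

    # List which root (or device-class shadow root) each CRUSH rule takes from.
    # Assumes the usual `ceph osd crush rule dump` JSON: a list of rules, each
    # with "rule_name" and a "steps" list holding {"op": "take", "item": ...,
    # "item_name": ...} entries.
    import json
    import subprocess

    rules = json.loads(subprocess.check_output(
        ["ceph", "osd", "crush", "rule", "dump", "-f", "json"]))

    for rule in rules:
        for step in rule.get("steps", []):
            if step.get("op") == "take":
                print(f'{rule["rule_name"]}: {step["item"]} ({step["item_name"]})')
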
> >> >
> >> > With this setup, the autoscaler is getting disabled in every pool with the
> >> > following error:
> >> >
> >> >      [pg_autoscaler WARNING root] pool <num> contains an overlapping root -<id>... skipping scaling
> >> >
> >> > There doesn't seem to be much documentation about overlapped roots. I think I'm
> >> > fundamentally not understanding what it means. Does it mean that the autoscaler
> >> > can't handle two different pools using OSDs under the same (shadow?) root in the
> >> > CRUSH map?
> >> >
> >> > Is this setup something that's not possible using the scale-down autoscaler
> >> > profile? It seems that the scale-up profile doesn't have a concept of overlapped
> >> > roots.
> >> >
> >> > Thank you,
> >> > Andrew

_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx


