Hi folks,
I have a small cluster of three Ceph hosts running Pacific. I'm trying
to balance resilience against disk usage, so I've set up a k=4 m=2
erasure-coded pool for some bulk storage on HDD devices.
With correct placement of PGs, this should allow me to take any one host
offline for maintenance: losing a host costs at most two of the six
shards, leaving the k=4 needed to reconstruct the data. I've written this
CRUSH rule for that purpose:
rule erasure_k4_m2_hdd_rule {
    id 3
    type erasure
    step take default class hdd
    step choose indep 3 type host
    step chooseleaf indep 2 type osd
    step emit
}
This should pick three hosts, then two OSDs from each, which at least
ensures that no single host holds more than two of any PG's six shards.
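
For reference, this is roughly the workflow I used to compile, test, and
inject the rule (the file names are just whatever I happened to use):

    # dump and decompile the current CRUSH map
    ceph osd getcrushmap -o crushmap.bin
    crushtool -d crushmap.bin -o crushmap.txt

    # ...add the rule above to crushmap.txt, then recompile...
    crushtool -c crushmap.txt -o crushmap.new

    # sanity-check the mappings the rule produces before injecting it
    crushtool -i crushmap.new --test --rule 3 --num-rep 6 --show-mappings | head

    ceph osd setcrushmap -i crushmap.new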
This appears to work correctly, but I'm running into an odd situation
when adding OSDs to the cluster: sometimes the chosen hosts swap
positions in a PG's up set, resulting in unnecessary remapping work.
For example, I have one PG that changed from OSDs [0,13,7,9,3,5] to
[0,13,3,5,7,9]. (Note that the middle pair and the last pair of OSDs have
swapped with one another.) From a quick perusal of other PGs being moved,
the two OSDs within a host never appear to be rearranged, but the order
of the chosen hosts may be shuffled.
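
In case the method matters, I've just been reading the sets out of
"ceph pg map" and looking for moving PGs with "ceph pg dump pgs_brief"
(the pg ID below is only an example):

    # up and acting sets for a single PG
    ceph pg map 4.1f

    # list PGs that are currently remapped
    ceph pg dump pgs_brief | grep remapped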
Is there something I'm missing that would make this rule more stable in
the face of OSD additions? (I'm wondering whether the host-choosing step
should be "firstn" rather than "indep", even though the discussion at
https://docs.ceph.com/en/latest/rados/operations/crush-map-edits/#crushmaprules
implies indep is preferable for EC pools.)
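
For concreteness, the variant I was contemplating would just swap the
host step, with everything else as above (untested):

    rule erasure_k4_m2_hdd_rule {
        id 3
        type erasure
        step take default class hdd
        step choose firstn 3 type host
        step chooseleaf indep 2 type osd
        step emit
    }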
I don't have current plans to expand beyond a three-host cluster, but if
there's an alternative way to express "no more than two OSDs per host per
PG", that could be helpful as well.
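
For what it's worth, the effect can be explored offline with crushtool
alone, along these lines (the OSD id, weight, and host name below are
placeholders for whatever is being added):

    # mappings produced by the current map
    crushtool -i crushmap.bin --test --rule 3 --num-rep 6 --show-mappings > before.txt

    # simulate adding one OSD under an existing host
    crushtool -i crushmap.bin --add-item 14 1.0 osd.14 \
        --loc host node-c --loc root default -o crushmap.plus-one
    crushtool -i crushmap.plus-one --test --rule 3 --num-rep 6 --show-mappings > after.txt

    # PGs whose mapping changed
    diff before.txt after.txt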
Any insights or suggestions would be appreciated.
Thanks,
aschmitz