Hi,
I only have one remark, on your assumption about maintenance with
your current setup. With your profile k=4 m=2 you'd have a min_size
of 5 (k + 1, which is the recommended value), so taking one host down
would still result in an IO pause because min_size is not met. To
allow IO you'd need to reduce min_size to 4, which is only
recommended in disaster scenarios.
With three nodes you'd be better off with replication size 3, although
it requires more storage, of course.
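In case it helps, checking and (temporarily) lowering min_size would
look roughly like this; "ec-bulk" is just a placeholder pool name:

  ceph osd pool get ec-bulk min_size     # show the current value (5 for k=4 m=2)
  ceph osd pool set ec-bulk min_size 4   # maintenance/disaster exception only
  ceph osd pool set ec-bulk min_size 5   # revert once all hosts are back up

But as said, I'd only do that as a short-lived exception and set it
back right away.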
Adding (or removing) OSDs always results in some remapping; I don't
think what you're describing is unexpected.
Regards,
Eugen
Quoting aschmitz <ceph-users@xxxxxxxxxxxx>:
Hi folks,
I have a small cluster of three Ceph hosts running on Pacific. I'm
trying to balance resilience and disk usage, so I've set up a k=4
m=2 pool for some bulk storage on HDD devices.
With the correct placement of PGs this should allow me to take any
one host offline for maintenance. I've written this CRUSH rule for
that purpose:
rule erasure_k4_m2_hdd_rule {
    id 3
    type erasure
    step take default class hdd
    step choose indep 3 type host
    step chooseleaf indep 2 type osd
    step emit
}
This should pick three hosts, and then two OSDs from each, which at
least ensures that no single host holds more than two of a PG's OSDs.
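For reference, a matching profile and pool would be created with
something like the following (the profile and pool names here are
just placeholders):

  ceph osd erasure-code-profile set ec42-hdd k=4 m=2 crush-device-class=hdd
  ceph osd pool create bulk-ec 128 128 erasure ec42-hdd erasure_k4_m2_hdd_rule

Passing the rule name makes the pool use the custom rule above rather
than the rule the profile would otherwise generate.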
This appears to work correctly, but I'm running into an odd
situation when adding additional OSDs to the cluster: sometimes the
hosts flip order in a PG's set, resulting in unnecessary remapping
work.
For example, I have one PG that changed from OSDs [0,13,7,9,3,5] to
[0,13,3,5,7,9]. (Note that the middle pair and the last pair of OSDs
have swapped with one another.) From a quick perusal of other PGs
that are being moved, the two OSDs within a host never appear to be
rearranged, but the set of hosts that are chosen may be shuffled.
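For what it's worth, the placements can also be inspected offline by
dumping the CRUSH map and replaying the rule with crushtool, roughly
like this (rule id 3 is the rule above, and --num-rep 6 is k+m):

  ceph osd getcrushmap -o crushmap.bin
  crushtool -i crushmap.bin --test --rule 3 --num-rep 6 --show-mappings

Comparing that output between the old and new CRUSH maps should show
how the placements shift without touching the cluster.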
Is there something I'm missing that would make this rule more stable
in the face of OSD addition? (I'm wondering if the host-choosing
step should be "firstn" rather than "indep", even though the
discussion at
https://docs.ceph.com/en/latest/rados/operations/crush-map-edits/#crushmaprules implies indep is preferable in EC
pools.)
I don't have current plans to expand beyond a three-host cluster,
but if there's an alternative way to express "not more than two OSDs
per host", that could be helpful as well.
Any insights or suggestions would be appreciated.
Thanks,
aschmitz
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx