On 23-09-2024 16:04, Dave Hall wrote:
> Thank you to everybody who has responded to my questions.
> At this point I think I am starting to understand. However, I am still
> trying to understand the potential for data loss.
> In particular:
> - In some ways it seems that as long as there is sufficient OSD capacity
> available, the worst that can happen from a bad CRUSH map is poor placement
> and poor performance. Is this correct?
If you have a (new) CRUSH rule that cannot map any OSDs, all PGs for
pools that use that rule would go into an inactive state, i.e.
downtime. So when you create a (new) rule, you should check that
CRUSH can indeed find enough OSDs to comply with the policy you defined.
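One way to check this before injecting anything is crushtool's test mode, which simulates placement against the compiled map. A minimal sketch (the file name, rule id, and replica count below are placeholders for your own values):

```shell
# Fetch the current CRUSH map in compiled (binary) form.
ceph osd getcrushmap -o crush.bin

# Simulate placement for rule id 1 at a replica count of 3.
# --show-bad-mappings prints only the inputs CRUSH could not map,
# so empty output means the rule can find enough OSDs everywhere.
crushtool -i crush.bin --test --rule 1 --num-rep 3 --show-bad-mappings
```

Run this for every rule/pool size combination you care about; any line of output indicates an input the rule failed to map fully.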
> - crushtool --compare - if the result of this command shows no
> mismatches, can we say that the adjusted CRUSH map is safe to apply?
Do you mean there are no differences? Then yes.
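For reference, a sketch of the comparison workflow (file names are placeholders; crush_new.bin would be your edited, recompiled candidate map):

```shell
# Save the map that is currently active in the cluster.
ceph osd getcrushmap -o crush_current.bin

# Compare the placements produced by the current map against the
# candidate map; crushtool reports how many mappings change.
crushtool -i crush_current.bin --compare crush_new.bin
```

Note that "no differences" means placement is unchanged; if you intended the new map to move data, some differences are expected, and the point of the comparison is to confirm the changes are the ones you wanted.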
> - If all of the 'inhibit flags' are turned on (no out, no down, no
> scrub/deep-scrub, no recover/rebalance/backfill, and perhaps pause) is it
> safe to apply an adjusted CRUSH map?
It should always be "safe". It just depends on the kind of CRUSHmap
you are injecting: one containing CRUSH rules without any valid
mappings might "break" things. Ceph will handle that and inform you
about it, but you might run into downtime when a "wrong" CRUSHmap is
injected. So you should always be careful and test the CRUSHmap
beforehand.
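For completeness, the flags mentioned in the question are set and cleared like this (a sketch; `pause` is listed separately because it stops client I/O entirely, which is usually more disruption than you want):

```shell
# Set the inhibit flags before injecting the adjusted map.
for flag in noout nodown noscrub nodeep-scrub norecover norebalance nobackfill; do
    ceph osd set "$flag"
done

# ... inject the new CRUSH map and check 'ceph status' here ...

# Clear the flags again once placement looks sane.
for flag in noout nodown noscrub nodeep-scrub norecover norebalance nobackfill; do
    ceph osd unset "$flag"
done
```

Keep in mind these flags only suppress reactions (marking OSDs out, data movement, scrubbing); they do not prevent PGs from going inactive if the injected map has rules with no valid mappings.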
> Is it safe to revert to the original
> CRUSH map if things don't seem quite right?
Yes. That's _exactly_ the thing you should do when things do not go
according to plan. So make sure you have a working CRUSHmap (backup)
that you can fall back to (ceph osd getcrushmap -o /tmp/crush_raw).
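The backup-and-revert pair, sketched end to end (the path is the one Stefan used above):

```shell
# Before changing anything: keep a binary copy of the known-good map.
ceph osd getcrushmap -o /tmp/crush_raw

# If the adjusted map misbehaves, inject the backup to revert.
ceph osd setcrushmap -i /tmp/crush_raw
```

Reverting triggers the same peering/rebalancing as any map change, so expect some data movement back to the original placement while the cluster settles.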
Gr. Stefan