Re: Overlapping Roots - How to Fix?

On 23-09-2024 16:31, Janne Johansson wrote:
> On Mon, 23 Sep 2024 at 16:23, Stefan Kooman <stefan@xxxxxx> wrote:
>> On 23-09-2024 16:04, Dave Hall wrote:
>>> Thank you to everybody who has responded to my questions.
>>>
>>> At this point I think I am starting to understand. However, I am still
>>> trying to understand the potential for data loss.
>>>
>>> In particular:
>>>
>>>     - In some ways it seems that as long as there is sufficient OSD capacity
>>>       available, the worst that can happen from a bad CRUSH map is poor
>>>       placement and poor performance. Is this correct?
>>
>> If you have a (new) CRUSH rule without any OSD mappings, all PGs
>> for pools that use that rule go into an inactive state, i.e.
>> downtime. So when you create a (new) rule you have to check that
>> CRUSH can indeed find enough OSDs to comply with the policy you defined.
>
> Are you sure? I have asked some pools to use an "impossible" crush
> rule after creation and the PGs only end up as "misplaced".

Apparently it depends ... You are right with regard to newly created pools and
the inactive state (at least that is what I have seen in all cases). If I have
a pool use a CRUSH rule that cannot map any OSDs (an nvme device-class rule
while no OSDs carry the nvme device class), the PGs become "unknown" (not
state "inactive" like I said). But IO for that pool does not work at that
point: both the acting and the up set show "[]p-1", i.e. no OSDs available.
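
For anyone who wants to reproduce it, a rough sketch of the test (the rule and
pool names below are made up for illustration, run on a throwaway cluster with
no OSDs of device class "nvme"):

  ceph osd crush rule create-replicated nvme_rule default host nvme
  ceph osd pool create testpool 32 32 replicated nvme_rule
  ceph pg ls-by-pool testpool   # PGs stay "unknown", up/acting sets empty, no primary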

However, if I switch the pool back to a valid rule, and then back again to the
invalid rule, the PGs become "active+clean+remapped" and IO does work (the up
set is []p-1, but the acting set still holds the previously mapped OSDs). That
is probably the same state you have seen in your cluster (and I have seen it
in Reef clusters as well). My tests were performed on a 16.2.11 test cluster.
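
The rule flip that produces that state, continuing with the made-up names from
the sketch above ("replicated_rule" is the stock default rule):

  ceph osd pool set testpool crush_rule replicated_rule   # valid rule: PGs go active+clean
  ceph osd pool set testpool crush_rule nvme_rule         # back to the impossible rule
  ceph pg ls-by-pool testpool   # now active+clean+remapped; acting set keeps the old OSDs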

> At creation they might stay inactive until a good place for them can be
> found, but then you can't write data to it, so it is not really a
> "data-loss" scenario if the pool never started.

Correct, no data loss in that situation.
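
For reference, a rule can be sanity-checked before any pool is pointed at it
by running the compiled CRUSH map through crushtool; a minimal sketch (the
rule id and replica count are examples, not taken from this thread):

  ceph osd getcrushmap -o crushmap.bin
  crushtool -i crushmap.bin --test --rule 1 --num-rep 3 --show-mappings
  # or --show-bad-mappings to list only the inputs CRUSH could not map fully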

Gr. Stefan




_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



