Theory about min_size and its implications

stefan.pinter@xxxxxxxxxxxxxxxx · Thu, 02 Mar 2023 08:16:06 -0000

Hi!

it is unclear to us what min_size implies beyond its mechanical effect. I hope someone can clear this up :)

scenario:
size is 3 and min_size is 2
2 rooms with 100 OSDs each and this crush rule (steps only):

                {
                    "op": "take",
                    "item": -10,
                    "item_name": "default"
                },
                {
                    "op": "choose_firstn",
                    "num": 2,
                    "type": "room"
                },
                {
                    "op": "chooseleaf_firstn",
                    "num": 2,
                    "type": "host"
                },
                {
                    "op": "emit"
                }

so if one room goes down/offline, around 50% of the PGs would be left with only 1 replica. that is below min_size, so those PGs become inactive and client I/O to them blocks.
if we set min_size to 1 and one room goes down, users would still be able to access all PGs - but what is the problem with a PG having only one active replica?
someone pointed out "split brain" but I am unsure about this. 
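
To make the scenario concrete, here is a rough Python sketch of how I picture the placement. It is a toy model, not CRUSH itself: it assumes the rule yields 2 hosts in each of the 2 rooms (4 candidates) and that the pool keeps the first 3 because size is 3, so each PG has 2 replicas in one room and 1 in the other. The room names, host names and function names are made up for illustration.

```python
from itertools import permutations

ROOMS = ("room_a", "room_b")  # hypothetical room names


def place_replicas(room_order, size=3, hosts_per_room=2):
    """Toy version of the rule: rooms in order, 2 hosts in each, keep the first `size`."""
    candidates = [(room, f"host{i}") for room in room_order for i in range(hosts_per_room)]
    return candidates[:size]


def surviving(placement, failed_room):
    """Replicas that are still reachable after `failed_room` goes offline."""
    return [replica for replica in placement if replica[0] != failed_room]


def pg_is_active(placement, failed_room, min_size):
    """Assumption: a PG serves I/O only while at least min_size replicas are up."""
    return len(surviving(placement, failed_room)) >= min_size


if __name__ == "__main__":
    # The room order differs per PG; with 2 rooms there are only 2 possible orders.
    for room_order in permutations(ROOMS):
        placement = place_replicas(room_order)
        left = len(surviving(placement, "room_a"))
        for min_size in (2, 1):
            state = "active" if pg_is_active(placement, "room_a", min_size) else "inactive"
            print(f"order={room_order} replicas_left={left} min_size={min_size} -> {state}")
```

If the room order is roughly evenly distributed across PGs, about half of them have their 2 replicas in the failed room, which is where the "around 50%" above comes from.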

I think what happens in the worst case is this:
only 1 replica of a PG is available, a client writes changes to it, and then the disk holding that replica dies as well. the 2 replicas in the room that is down never saw those writes, so I guess we'd need to recover the data from those 2 stale replicas, and we would have lots of trouble with recovery and also with data inconsistency, right?
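
The worst case above can be sketched as a small Python toy model. It only counts writes per replica and is not how Ceph peering or recovery actually works; the replica names, the `Replica` class and `run_scenario` are hypothetical, and the write counts (10 before the outage, 5 during it) are arbitrary.

```python
from dataclasses import dataclass


@dataclass
class Replica:
    name: str
    version: int = 0     # number of writes this replica has applied
    up: bool = True      # reachable right now
    lost: bool = False   # disk is gone for good


def write(replicas, min_size):
    """Apply one write to every up replica; refuse it if fewer than min_size are up."""
    up = [r for r in replicas if r.up and not r.lost]
    if len(up) < min_size:
        return False
    for r in up:
        r.version += 1
    return True


def run_scenario(min_size):
    """Return (acknowledged writes, newest version still recoverable)."""
    a1, a2, b1 = Replica("a1"), Replica("a2"), Replica("b1")
    replicas = [a1, a2, b1]
    acked = sum(write(replicas, min_size) for _ in range(10))   # healthy cluster
    a1.up = a2.up = False                                       # room A goes offline
    acked += sum(write(replicas, min_size) for _ in range(5))   # clients keep writing
    b1.up, b1.lost = False, True                                # last replica's disk dies
    a1.up = a2.up = True                                        # room A comes back
    recoverable = max(r.version for r in replicas if not r.lost)
    return acked, recoverable


if __name__ == "__main__":
    print(run_scenario(min_size=1))  # (15, 10): 5 acknowledged writes are gone
    print(run_scenario(min_size=2))  # (10, 10): writes were refused, nothing acknowledged is lost
```

In this model min_size 1 trades availability during the outage for the risk of losing acknowledged writes, while min_size 2 blocks I/O but never acknowledges a write that exists on only one disk.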

thank you!