Re: Theory about min_size and its implications



On 02.03.23 09:16, stefan.pinter@xxxxxxxxxxxxxxxx wrote:

so if one room goes down/offline, around 50% of the PGs would be left with only 1 replica making them read-only.

Most people forget the other half of the cluster in such a scenario.

For us humans it is obvious that one room is down, because we can see it from the outside.

The OSDs only see that they have lost connectivity to their peering partners. They cannot tell whether the other hosts are down or only the network in between has failed.

It could be that only the link between the two rooms is dead, leaving two copies running in one room and only one in the other. If you then allow writes in the "smaller" room as well as in the room with two copies, you immediately get a conflict as soon as the network connection between the rooms is reestablished.
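The partition case can be sketched as a toy model in Python. This is only an illustration under simplifying assumptions: it ignores monitors, peering and CRUSH entirely, and the names `Side`, `try_write` and `simulate_partition` are made up for this example, not anything from Ceph. It only shows that with three copies split 2+1, a min_size of 1 lets both sides accept writes independently, while a min_size of 2 does not.

```python
from dataclasses import dataclass, field


@dataclass
class Side:
    """One side of a network partition, holding some replicas of a PG (hypothetical model)."""
    name: str
    replicas: int
    writes: list = field(default_factory=list)

    def try_write(self, value, min_size):
        # A side accepts a write only if it still has at least min_size replicas.
        if self.replicas >= min_size:
            self.writes.append(value)
            return True
        return False


def simulate_partition(min_size):
    # Three copies in total: two in room A, one in room B, link between rooms is down.
    room_a = Side("room A", replicas=2)
    room_b = Side("room B", replicas=1)
    room_a.try_write("client-1 update", min_size)
    room_b.try_write("client-2 update", min_size)
    # Both sides having accepted writes means diverging data once the link returns.
    conflict = bool(room_a.writes) and bool(room_b.writes)
    return room_a.writes, room_b.writes, conflict


for min_size in (1, 2):
    a, b, conflict = simulate_partition(min_size)
    print(f"min_size={min_size}: room A writes={len(a)}, "
          f"room B writes={len(b)}, conflict={conflict}")
```

In this sketch only the min_size=1 run ends with writes on both sides; with min_size=2 the single copy in the smaller room refuses the write.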

This is why min_size=1 is a really bad idea outside of a disaster scenario where the other two copies are completely lost to a fire.

Robert Sander
Heinlein Consulting GmbH
Schwedter Str. 8/9b, 10119 Berlin

Tel: 030 / 405051-43
Fax: 030 / 405051-19

Amtsgericht Berlin-Charlottenburg - HRB 220009 B
Geschäftsführer: Peer Heinlein - Sitz: Berlin
