Re: Worst thing that can happen if I have size= 2

Mario Giammarco <mgiammarco@xxxxxxxxx> · Thu, 4 Feb 2021 11:23:45 +0100

On Thu, 4 Feb 2021 at 00:33, Simon Ironside <
sironside@xxxxxxxxxxxxx> wrote:

>
>
> On 03/02/2021 19:48, Mario Giammarco wrote:
>
> To labour Dan's point a bit further, maybe a RAID5/6 analogy is better
> than RAID1. Yes, I know we're not talking erasure coding pools here but
> this is similar to the reasons why people moved from RAID5 (size=2, kind
> of) to RAID6 (size=3, kind of). I.e. the more disks you have in an array
> (cluster, in our case) and the bigger those disks are, the greater the
> chance you have of encountering a second problem during a recovery.
>
Yes, I know the motivations for RAID6, but to simplify the use case I am
comparing Ceph size=2 to RAID1.
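
A rough sketch of the quoted "second problem during a recovery" argument,
under loudly simplified assumptions: every disk fails independently with the
same probability over the recovery window, and any second failure counts as
a problem. The function name and the 0.001 figure are illustrative only, not
measured values and not anything Ceph reports.

```python
def p_second_failure(n_disks: int, p_disk: float) -> float:
    """Chance that at least one of the remaining disks fails during recovery.

    Assumes independent failures with the same probability p_disk per disk
    over the recovery window (an illustrative simplification).
    """
    remaining = n_disks - 1  # one disk has already failed
    return 1.0 - (1.0 - p_disk) ** remaining


if __name__ == "__main__":
    # Hypothetical per-disk failure probability during one recovery window.
    p = 0.001
    for n in (2, 6, 24):
        print(f"{n} disks: {p_second_failure(n, p):.5f}")
```

With two disks (the RAID1 case) the result is just p; it grows with the
number of disks, which is the point of the RAID5/RAID6 analogy.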

> > What I ask is this: what happens with min_size=1 and split brain,
> > network down or similar things: does Ceph block writes because it has no
> > quorum on monitors? Are there some failure scenarios that I have not
> > considered?
>
> It sounds like in your example you would have 3 physical servers in
> total. So would you have both a monitor and OSD processes on each server?
>
>
Yes, sorry if it was not clear:
- three servers
- three monitors
- three managers
- 6 OSDs (two disks per server)

> If so, it's not really related to min_size=1 but to answer your question
> you could lose one monitor and the cluster would continue. Losing a
> second monitor will stop your cluster until this is resolved. In your
> example setup (with colocated mons & OSDs) this would presumably also
> mean you'd have lost two OSD servers too, so you'd have bigger problems.
>
>
If the switch is lost, the monitors are still up but cannot communicate
with each other, so they should stop?
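
A minimal sketch of the quorum arithmetic behind this question, assuming
monitor quorum requires a strict majority of the configured monitors. The
function name is hypothetical and this is not Ceph code; it only checks the
three cases discussed above for a three-monitor cluster.

```python
def has_quorum(reachable: int, total: int) -> bool:
    """True if a group of `reachable` monitors is a strict majority of `total`."""
    return reachable > total // 2


TOTAL_MONS = 3  # three servers, one monitor each, as in the setup above

# One monitor lost: the remaining two can still see each other.
one_lost = has_quorum(2, TOTAL_MONS)

# Two monitors lost: the single survivor is not a majority.
two_lost = has_quorum(1, TOTAL_MONS)

# Switch lost: all three monitors are up, but each one is alone
# in its own partition of size 1.
switch_lost = any(has_quorum(size, TOTAL_MONS) for size in (1, 1, 1))

if __name__ == "__main__":
    print("one monitor lost, quorum:", one_lost)
    print("two monitors lost, quorum:", two_lost)
    print("switch lost, any partition with quorum:", switch_lost)
```

Under that assumption, a lost switch looks to each monitor the same as
losing the other two: no partition holds a majority, so none of them can
form quorum on its own.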
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx