Re: Monitor issue

On 10/30/2012 06:06 AM, Roman Alekseev wrote:
> On 29.10.2012 18:59, Wido den Hollander wrote:
>>
>>
>> On 10/29/2012 03:48 PM, Roman Alekseev wrote:
>>> Hello,
>>>
>>> I have 3 monitors on different nodes, and when 'mon.a' was stopped the
>>> whole cluster stopped working too.
>>> My conf: http://pastebin.com/hT3qEhUF
>>>
>>> Could someone explain how to fix this kind of failure?
>>
>> Could you explain a bit more about the setup?
>>
>> Which version are you running?
>>
>> What do you mean by failure? Is the ceph -s command still working?
>>
>> How sure are you that you didn't catch a bug that killed all three
>> monitors? Are those processes actually up and running?
>>
>> Did you check the logs of the monitors?
>>
>> Could you let us know?
>>
>> Thanks!
>>
>> Wido
> 
> Hi Wido,
> 
> I'm running ceph version 0.48.1argonaut.
> The "ceph -s" command doesn't work until I start that monitor again.
> By failure I mean that ceph commands (such as ceph -s, ceph -w, ceph mon
> dump, etc.) don't respond. I've re-added all three mons and found the
> following situations:
> 
> Situation A:
> 1) mon.a is disabled:
> health HEALTH_WARN 1 mons down, quorum 1,2 b,c (cluster works)
> 2) mon.b is disabled:
> health HEALTH_WARN 1 mons down, quorum 0,1 a,c (cluster works)
> 3) mon.c is disabled:
> health HEALTH_WARN 1 mons down, quorum 0,2 a,b (cluster works)
> 
> Situation B:
> If 2 mons are disabled, the whole cluster stops working.
> So the cluster works only when at least 2 monitors are running.
> 
> Is that correct?
> 

In a nutshell, yes.

You need to have a majority of monitors up and running, and in the
quorum, for the cluster to work.

So, in 'A', you always have 2 out of the 3 existing monitors up; this
means a majority is up and quorum can be formed.

In 'B', however, you have only one monitor up. Given there are 3 monitors
in the monmap, that is not enough to form a quorum: you need a strict
majority, i.e. floor(N/2)+1, N being the total number of monitors in the
monmap. With N = 3, as in your case, that means 2 monitors.

Same rules apply if you had, say, 5 monitors: you'd need 3 up for the
cluster to work; if you only had 2, the monitors wouldn't be able to
form quorum.
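
To make the arithmetic concrete, here is a minimal sketch of the majority
rule described above. The function names quorum_size and has_quorum are
made up for illustration and are not part of Ceph; the sketch only counts
monitors and ignores everything else the monitors do when forming quorum.

```python
def quorum_size(total_monitors: int) -> int:
    """Smallest strict majority of the monitors listed in the monmap."""
    if total_monitors < 1:
        raise ValueError("a cluster needs at least one monitor")
    # Strict majority: more than half, i.e. floor(N/2) + 1.
    return total_monitors // 2 + 1


def has_quorum(monitors_up: int, total_monitors: int) -> bool:
    """True if enough monitors are up to form a quorum (hypothetical helper)."""
    return monitors_up >= quorum_size(total_monitors)


if __name__ == "__main__":
    for total in (3, 4, 5):
        needed = quorum_size(total)
        print(f"{total} monitors in monmap: need {needed} up, "
              f"can lose {total - needed}")
```

Note that, by this rule, 4 monitors tolerate only one failure, the same as
3, which is why an odd number of monitors is the usual recommendation.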

And by the way, the cluster doesn't work because, unless you have a
healthy monitor cluster with a formed quorum, the monitors that are up
and running will basically refuse to answer any kind of request made to
them (aside from specific commands made through the admin socket, which
are targeted at one specific monitor).

Hope this helps.

  -Joao