Re: mon woes

Hi,

I don't like to promote my own website on the mailing list, but this article
tries to address your problem with a possible answer :-)

http://www.sebastien-han.fr/blog/2013/01/28/ceph-geo-replication-sort-of/
--
Regards,
Sébastien Han.


On Thu, Feb 14, 2013 at 2:03 PM, Joao Eduardo Luis
<joao.luis@xxxxxxxxxxx> wrote:
> On 02/14/2013 12:24 PM, Simon Leinen wrote:
>>
>> Wolfgang Hennerbichler writes:
>>>
>>> I have a ceph cluster on 2 sites. One site has 2 mons, the other site
>>> has 1 mon. [...]
>>
>>
>> As Martin wrote, if you lose the site with the 2 mons, the entire
>> cluster will become unavailable.
>>
>> Here's what I've been thinking to myself could be a nice solution:
>>
>> Get a third site somewhere, and move one of the two mons currently at
>> your first site to that third site.  The site only needs space and
>> performance for one (1) VM running ceph-mon.  Ideally it would be
>> reliable, well-connected in terms of RTT etc. - but even if it isn't,
>> that may not matter so much.
>>
>> My reasoning is that under normal conditions, the two mons in your
>> "real" sites will be sufficient (quorum) to maintain consistency of the
>> cluster.  So even if the third-site mon is somehow "asleep at the
>> wheel", that wouldn't necessarily have any noticeable impact on your
>> cluster's performance.  (That's pure hypothesis, I haven't tried this or
>> otherwise thought this through.  Please comment if you disagree!)
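
The quorum arithmetic behind both layouts can be sketched as follows. This
is only an illustration, assuming quorum means a strict majority of all
configured monitors; the site names and the helper functions are
hypothetical and not part of Ceph.

```python
def has_quorum(alive: int, total: int) -> bool:
    """Assumption: quorum requires a strict majority of all monitors."""
    return alive > total // 2


def surviving_quorum(layout: dict) -> dict:
    """For each site, report whether quorum survives losing that whole site.

    layout maps a (hypothetical) site name to the number of mons it hosts.
    """
    total = sum(layout.values())
    return {site: has_quorum(total - mons, total) for site, mons in layout.items()}


# Current layout from the thread: 2 mons on one site, 1 mon on the other.
two_sites = {"site_a": 2, "site_b": 1}
# Proposed layout: one mon per site, including a small third site.
three_sites = {"site_a": 1, "site_b": 1, "site_c": 1}

print(surviving_quorum(two_sites))
print(surviving_quorum(three_sites))
```

With two sites, losing the site that holds two mons leaves one of three,
which is not a majority. With one mon per site, any single site can be lost
and two of three remain.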
>
>
> This could actually cause problems for the quorum.  If one of the monitors
> doesn't perform well enough, or if it gets overloaded, you may start seeing
> it behave erratically (constantly dropping out of quorum, for instance), and
> that might even affect the other two monitors (dropping out of quorum usually
> leads to a subsequent attempt to rejoin it, and that will trigger an
> election).
>
> The biggest problem is still latency.  Given that Paxos takes into
> consideration all the monitors in the quorum for each round, if the RTT to
> one of the monitors is high enough, Paxos will start to time out.  A timeout
> makes a monitor bootstrap, which triggers a new election, and so on.  If one
> of the monitors has chronically high latency, it may render the cluster
> unusable.
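
A toy model of the latency argument above, not of the real Paxos
implementation: it assumes a round has to wait for every monitor in the
quorum, so the slowest member sets the pace. The monitor names, RTT values
and the timeout are made-up numbers for illustration only.

```python
def round_latency_ms(rtts_ms: dict) -> float:
    """Assumption: a round completes only once the slowest quorum member answers."""
    return max(rtts_ms.values())


def round_times_out(rtts_ms: dict, timeout_ms: float) -> bool:
    """True if the slowest member alone is enough to exceed the timeout."""
    return round_latency_ms(rtts_ms) > timeout_ms


# Hypothetical RTTs: two well-connected mons and one distant third-site mon.
rtts_ms = {"mon.a": 1.0, "mon.b": 2.0, "mon.c": 250.0}
timeout_ms = 100.0  # hypothetical value, not a Ceph default

print(round_latency_ms(rtts_ms))
print(round_times_out(rtts_ms, timeout_ms))
```

In this model the two fast monitors do not help: as long as the slow one is
in the quorum, every round is as slow as its RTT.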
>
> It is, however, possible to adjust those timeouts, but I have never tried
> doing so to deal with such a scenario.  It ought to work, but I am not sure
> what the potential consequences (if any) of doing so would be.
>
>
>> I guess you can hardwire the ranks of the mons to make sure that the
>> third-site monitor never becomes elected as leader.
>
>
> Ranks are assigned based on ip:port.  If you have some freedom when assigning
> IPs, then it should be fairly straightforward (127.0.0.1:6789 <
> 127.0.0.2:6789 < 127.0.0.2:6790).
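
To check a planned address layout before deploying it, the ordering in the
example above can be reproduced with a short sketch. It assumes addresses
compare by IP first and port second, which matches the three addresses
quoted but may not be exactly how the monitors compare them internally. The
monitor names and the rank_monitors helper are hypothetical.

```python
import ipaddress


def rank_monitors(addrs: dict) -> dict:
    """Return {name: rank}, where rank 0 is the lowest ip:port.

    Assumption: ordering is by numeric IP first, then by port.
    """
    def sort_key(item):
        _, addr = item
        ip, port = addr.rsplit(":", 1)
        return (ipaddress.ip_address(ip), int(port))

    ordered = sorted(addrs.items(), key=sort_key)
    return {name: rank for rank, (name, _) in enumerate(ordered)}


# Addresses taken from the example in the message; names are made up.
mons = {
    "third-site": "127.0.0.2:6790",
    "site-1": "127.0.0.1:6789",
    "site-2": "127.0.0.2:6789",
}

print(rank_monitors(mons))
```

Giving the third-site monitor the highest address puts it last in this
ordering, which is the effect the suggestion is after.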
>
>
> _______________________________________________
> ceph-users mailing list
> ceph-users@xxxxxxxxxxxxxx
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com