Re: Corosync documentation / API ? (duplicate?)

Matan,

Matan Shukry wrote:
> Well, this is both correct and incorrect.
> 
> Zookeeper can only handle a "small" number of nodes in an ensemble (servers).
> However, it can handle ~1000 clients, each of which can read/write data.
> 

Ok.

> Is there a similar concept in corosync, so that if needed I can deploy
> more nodes, and they will be able to negotiate together using CPG?
> 

Actually, no. Corosync itself (totemsrp, ... plus the CPG service) can handle
quite a lot of clients (CPG clients) without any problem, but there is no
way to run a CPG client on a node where corosync itself is not
running. Theoretically, it shouldn't be a big problem to implement such
functionality, because a CPG client communicates with the CPG server via libqb
IPC, so the only missing part is making libqb IPC network-capable
(currently only /dev/shm files and unix sockets are implemented).
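
For reference, this is roughly what the client side of that IPC connection
looks like (a minimal sketch against the cpg.h API shipped with corosync 2.x;
the group name "demo" and the empty callbacks are just placeholders; build
with -lcpg):

  #include <stdio.h>
  #include <string.h>
  #include <corosync/cpg.h>

  int main(void)
  {
          cpg_handle_t handle;
          cpg_callbacks_t callbacks;   /* deliver/confchg callbacks, left empty here */
          struct cpg_name group;
          cs_error_t err;

          memset(&callbacks, 0, sizeof(callbacks));

          /* Opens the libqb IPC connection to the local corosync daemon.
           * This is the step that fails when corosync is not running locally. */
          err = cpg_initialize(&handle, &callbacks);
          if (err != CS_OK) {
                  fprintf(stderr, "cpg_initialize failed: %d\n", err);
                  return 1;
          }

          strcpy(group.value, "demo");
          group.length = strlen("demo");

          /* Join the closed process group; other members see a confchg event. */
          err = cpg_join(handle, &group);
          if (err != CS_OK) {
                  fprintf(stderr, "cpg_join failed: %d\n", err);
                  return 1;
          }

          cpg_finalize(handle);
          return 0;
  }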

> Is a "depth" cluster possible?
> e.g.
> Node 1-5 in corosync cluster, each "owning" 20% of data, and another 5
> clusters of N (20? .) nodes each, ending up with 100 nodes, each handling
> 1% of data?
> If so, how scalable is this approach?


Regards,
  Honza

> On Jan 7, 2014 12:58 PM, "Jan Friesse" <jfriesse@xxxxxxxxxx> wrote:
> 
>> Matan,
>>
>> Matan Shukry wrote:
>>> Wow, first let me thank you for the long answer - thanks!
>>>
>>> You provided a lot of valuable information. However, I have a few follow-up
>>> questions:
>>>
>>> 1. About corosync/zookeeper differences: is corosync, like zookeeper, able
>>> to handle many (100, 1000, more, possibly unlimited) machines? Or will
>>
>> No. The practical limit is 64 nodes. And honestly, I'm really unsure
>> whether Zookeeper is able to handle 1000 nodes. Creating a system with
>> that many nodes while keeping all the properties of totally ordered,
>> reliable message delivery would be real overkill. At the very least it
>> would take ages to transfer data to all nodes, since every node must
>> receive each message and send some kind of ack.
>>
>>
>>> it survive only with a small number of nodes? Are there any real examples
>>> of big clusters? Which is the biggest?
>>>
>>> 2. Is corosync scalable in a scale-out manner? Will adding nodes lower the
>>> resource requirements of Corosync (network? CPU? ...), or only "bigger"
>>> machines?
>>>
>>
>> Network load should grow less than linearly when using mcast. CPU load
>> shouldn't be affected. So no, corosync is not scalable in a scale-out manner.
>>
>>
>>> 3. Regarding quorum: is there an option to run without quorum? That is,
>>> even if N-1 nodes have failed, I still want the last node to keep running.
>>> Quorum seems useless in such a case. To me, anyway.
>>>
>>
>> Yes. Corosync by default runs without quorum, and as I said in the previous
>> email, it's perfectly possible to end up with N 1-node clusters. Also,
>> with the last_man_standing feature of quorum, you can have quorum degrade
>> down to a single node.
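>>
>> Just for illustration, a votequorum section along these lines enables it
>> (a sketch only; the vote count and the window, in milliseconds, are
>> example values):
>>
>>   quorum {
>>       provider: corosync_votequorum
>>       expected_votes: 5
>>       last_man_standing: 1
>>       last_man_standing_window: 10000
>>   }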
>>
>>> And again, thanks a lot for all the information!
>>>
>>
>> Regards,
>>   Honza
>>
>>> Yours,
>>> Matan Shukry.
>>> On Jan 6, 2014 12:35 PM, "Jan Friesse" <jfriesse@xxxxxxxxxx> wrote:
>>>
>>>> Christine Caulfield wrote:
>>>>> On 02/01/14 19:39, Matan Shukry wrote:
>>>>>> (Steven Dake)
>>>>>> I have to say I'm not afraid of (and even slightly prefer) an API, and I am
>>>>>> interested in as much availability as I can get, so corosync is what I'm
>>>>>> looking for. Good to know though :)
>>>>
>>>> That's good.
>>>>
>>>>>>
>>>>>> (Christine Caulfield)
>>>>>> As far as I can tell, pacemaker (and similar products) is designed to
>>>>>> deliver high availability to services. Although my application will
>>>>>> become a service in the end,
>>>>>> I also need a way for the services, running on different machines (or
>>>>>> not; there might be multiple processes on the same machine, probably in
>>>>>> debug/test environments),
>>>>>> to talk with each other.
>>>>>>
>>>>>> That is, I can understand how pacemaker could replace corosync 'sam'
>>>>>> (leaving aside the efficiency of each one); however, I don't see how
>>>>>> pacemaker would be able
>>>>>> to replace CPG.
>>>>>> I will say this though:
>>>>>>
>>>>>> 1. The use of CPG between the services is not directly related to
>>>>>> availability. They need to talk whenever a process goes up/down, so
>>>>
>>>> This is exactly where CPG does a very good job.
>>>>
>>>>>> that one process can
>>>>>>    'take over' the data owned by the failed process (this should be as fast
>>>>>> as possible), or 'share' some data with a new process (this can have a bit of
>>>>
>>>> This is also achievable, but keep in mind that CPG is more or less
>>>> stateless. It can exchange messages, and they are delivered atomically,
>>>> but it does not store any data.
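>>>>
>>>> For example, sending a message to every current member of the joined group
>>>> is roughly this (a sketch only, assuming a handle set up with
>>>> cpg_initialize()/cpg_join()):
>>>>
>>>>   #include <sys/uio.h>
>>>>   #include <corosync/cpg.h>
>>>>
>>>>   /* Broadcast a buffer to all current members of the joined group.
>>>>    * CPG_TYPE_AGREED gives totally ordered delivery to every member,
>>>>    * including the sender; nothing is persisted anywhere. */
>>>>   static cs_error_t send_to_group(cpg_handle_t handle,
>>>>                                   const void *buf, size_t len)
>>>>   {
>>>>           struct iovec iov;
>>>>
>>>>           iov.iov_base = (void *)buf;
>>>>           iov.iov_len  = len;
>>>>
>>>>           /* May return CS_ERR_TRY_AGAIN under flow control; the caller
>>>>            * should retry later in that case. */
>>>>           return cpg_mcast_joined(handle, CPG_TYPE_AGREED, &iov, 1);
>>>>   }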
>>>>
>>>>>> delay, if needed).
>>>>>>
>>>>>>    In order to balance the cluster load, when a new process 'goes up',
>>>>>> it should take smaller pieces of data from each node, rather than a big
>>>>>> piece from one node, which is why the new process needs to talk to all
>>>>>> other processes.
>>>>>>
>>>>>>    When a process fails the requirement is the same, although this may
>>>>>> happen over time, to decrease the downtime of 'unowned' data.
>>>>>>
>>>>>> 2. I am not 100% sure the CPG protocol (the Totem Single-Ring Ordering
>>>>>> and Membership Protocol, as far as I know) is the best
>>>>>>    fit for such a case. Then again, I am not 100% sure how the protocol
>>>>>> works. From a brief overview of the protocol,
>>>>>>    it seems to match the requirement in (1).
>>>>>>    However, I would love to hear someone else's opinion on the matter.
>>>>>>
>>>>>> 3. Lately I have been messing around with hadoop, and it seems my 'data
>>>>>> sharing' requirement also exists in mapreduce/hdfs (although with less
>>>>>>      strict timing requirements, I think), and seems to be achieved using
>>>>>>      ZooKeeper.
>>>>>>      I was wondering what the main differences are between
>>>>>> ZooKeeper and Corosync, specifically regarding performance.
>>>>>>      I did try to 'google' the subject a bit, although most answers were
>>>>>> just in the style of 'they look the same', even though,
>>>>>>      as far as I can tell, the goals of each project are different
>>>>>> (key-value store vs high availability? ...).
>>>>
>>>> Probably the biggest difference between Zookeeper and Corosync is
>>>> membership/quorum. Zookeeper is a set of a predefined number of nodes,
>>>> where quorum is a simple majority. Corosync has fully dynamic membership
>>>> (and provides dynamic quorum), where it's perfectly possible to end up
>>>> with N one-node clusters (and the application developer must deal with that).
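>>>>
>>>> In practice that means reacting to membership changes in the confchg
>>>> callback yourself. A rough sketch (the callback signature follows cpg.h;
>>>> any quorum or split-brain policy is up to the application):
>>>>
>>>>   #include <stdio.h>
>>>>   #include <corosync/cpg.h>
>>>>
>>>>   /* Called from cpg_dispatch() whenever the group membership changes.
>>>>    * After a network split every partition keeps running, so the
>>>>    * application must decide here what to do about nodes that left. */
>>>>   static void confchg_cb(cpg_handle_t handle,
>>>>                          const struct cpg_name *group_name,
>>>>                          const struct cpg_address *member_list, size_t member_list_entries,
>>>>                          const struct cpg_address *left_list, size_t left_list_entries,
>>>>                          const struct cpg_address *joined_list, size_t joined_list_entries)
>>>>   {
>>>>           size_t i;
>>>>
>>>>           for (i = 0; i < left_list_entries; i++)
>>>>                   printf("node %u (pid %u) left the group\n",
>>>>                          left_list[i].nodeid, left_list[i].pid);
>>>>           for (i = 0; i < joined_list_entries; i++)
>>>>                   printf("node %u (pid %u) joined the group\n",
>>>>                          joined_list[i].nodeid, joined_list[i].pid);
>>>>
>>>>           printf("the group now has %zu members\n", member_list_entries);
>>>>   }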
>>>>
>>>>>>
>>>>>> 4. I did eventually find the man pages for the cpg API (after seeing
>>>>>> your comment). I just had to search
>>>>>>      for cpg rather than corosync.
>>>>>>
>>>>>> 5. I looked at the source tree for examples, however all I could find
>>>>>> were tests. Even though the tests may do the basic
>>>>>>      connection/messages/etc. (not completely sure about that, by the way),
>>>>>> it is not explained, nor is it easy to read what is there.
>>>>
>>>> test/testcpg.c is actually a good example where almost the whole CPG API
>>>> is used. CPG itself is not able to do much more.
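>>>>
>>>> The core of such a client is just a dispatch loop. Roughly (a sketch only,
>>>> assuming a handle set up with cpg_initialize() and cpg_join()):
>>>>
>>>>   #include <poll.h>
>>>>   #include <corosync/cpg.h>
>>>>
>>>>   /* Wait for CPG events and run the deliver/confchg callbacks that were
>>>>    * registered in cpg_initialize(). Returns on error or shutdown. */
>>>>   static void event_loop(cpg_handle_t handle)
>>>>   {
>>>>           struct pollfd pfd;
>>>>           int fd;
>>>>
>>>>           if (cpg_fd_get(handle, &fd) != CS_OK)
>>>>                   return;
>>>>
>>>>           pfd.fd = fd;
>>>>           pfd.events = POLLIN;
>>>>
>>>>           while (poll(&pfd, 1, -1) > 0) {
>>>>                   if (cpg_dispatch(handle, CS_DISPATCH_ALL) != CS_OK)
>>>>                           break;
>>>>           }
>>>>   }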
>>>>
>>>> Of course, the best example is always the source code of real projects
>>>> using Corosync, such as Pacemaker, qpid, asterisk, cmirror, ...
>>>>
>>>>
>>>>>>      I could not find any example, even one with simple, good comments.
>>>>>>      Is there any chance you can link me to an existing example in the
>>>>>> source tree?
>>>>>>
>>>>>> 6. You also said there is little user-level documentation. Do you by any
>>>>>> chance know of a simple tutorial on setting up CPG,
>>>>>>      hopefully including sending/receiving messages and anything
>>>>>> related, and could you link it?
>>>>>>
>>>>
>>>> Let me explain a few details. Corosync itself is an implementation of the
>>>> Totem protocol (originally reliable, totally ordered multicast with EVS
>>>> properties) + a fragmentation layer + RRP (Redundant Ring Protocol) +
>>>> services. One of the services is CPG.
>>>>
>>>> In other words, you are not setting up CPG but Corosync. If you have
>>>> correctly set up corosync (see corosync.conf), just exec testcpg and play
>>>> with shutting down nodes, ... to see how the other nodes react.
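>>>>
>>>> A minimal two-node corosync.conf for playing with testcpg could look
>>>> roughly like this (a sketch only; the cluster name and addresses are just
>>>> example values):
>>>>
>>>>   totem {
>>>>       version: 2
>>>>       cluster_name: testcluster
>>>>       transport: udpu
>>>>   }
>>>>
>>>>   nodelist {
>>>>       node {
>>>>           ring0_addr: 192.0.2.1
>>>>           nodeid: 1
>>>>       }
>>>>       node {
>>>>           ring0_addr: 192.0.2.2
>>>>           nodeid: 2
>>>>       }
>>>>   }
>>>>
>>>>   logging {
>>>>       to_syslog: yes
>>>>   }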
>>>>
>>>> The other services are probably not very interesting for your use case.
>>>>
>>>> Honza
>>>>
>>>>>
>>>>> There are no tutorials for coding services that I know of, they've never
>>>>> been asked for before, and none of us on the team are tech writers. The
>>>>> man pages, sources and examples are probably all you will find on the
>>>>> subject, I'm afraid.
>>>>>
>>>>> For data sharing you might find the openais Ckpt service useful, but be
>>>>> wary of the other openais services; few are complete or well tested.
>>>>>
>>>>> Without knowing more about what you are planning to do it's hard to be
>>>>> more specific (and I wouldn't have the time anyway!). Pretty much all of
>>>>> the documentation you'll find is about managing existing services using
>>>>> pacemaker and rgmanager, which is what the vast majority of people seem
>>>>> to want to do.
>>>>>
>>>>> Chrissie
>>>>>
>>>>
>>>
>>
>>
> 

_______________________________________________
discuss mailing list
discuss@xxxxxxxxxxxx
http://lists.corosync.org/mailman/listinfo/discuss



