Re: Questions about sync callbacks in the services


 



OK. I still don't understand why you wouldn't run corosync on each node
in that case.
AFAIK, the Totem protocol's performance degrades as the size of the ring (the number of nodes participating in the ring) grows:
- The token loss timeout is calculated based on the number of nodes in the ring. As the number of nodes increases, the timeout also increases, so failure detection and recovery take longer.
- The messaging delay increases with the number of nodes in the ring.

We have 12 blades per chassis, so if we ran corosync on each blade, the ring size would be 12 * <number of chassis>!

So it would reduce the performance.
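
To put rough numbers on this, here is a small sketch. It assumes the rule described in corosync.conf(5) as I understand it: with three or more nodes, the effective token timeout is token + (nodes - 2) * token_coefficient. The 1000 ms and 650 ms values and the chassis count of 4 are only illustrative assumptions, not figures from our setup, and the function names are made up for this example.

```python
def ring_size(blades_per_chassis, chassis_count):
    """Number of ring members if corosync runs on every blade."""
    return blades_per_chassis * chassis_count


def effective_token_timeout_ms(token_ms, token_coefficient_ms, nodes):
    """Assumed rule: the timeout grows linearly once the ring has 3+ nodes."""
    if nodes < 3:
        return token_ms
    return token_ms + (nodes - 2) * token_coefficient_ms


# Illustrative values only (hypothetical 4-chassis deployment).
one_chassis = ring_size(12, 1)
four_chassis = ring_size(12, 4)
timeout_one = effective_token_timeout_ms(1000, 650, one_chassis)
timeout_four = effective_token_timeout_ms(1000, 650, four_chassis)
print(one_chassis, timeout_one)
print(four_chassis, timeout_four)
```

Under these assumptions the timeout grows with every blade added to the ring, which is the concern with running corosync on each blade.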

CPG itself never sends messages over the ring. What happens is:
- IPC client wants to send cpg message
- test if no sync is in progress
- lock
- cpg sends message
- unlock

Between lock and unlock, a sync cannot happen. When the CPG service
receives a message from totem, it never replies back through totem.
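
The sequence above can be pictured with a toy model. This is only a sketch of the idea and not corosync code: the class and method names are hypothetical, and for simplicity the "no sync in progress" test is done while holding the lock, which is an assumption on my part.

```python
import threading


class CpgSendGate:
    """Toy model: a send and a sync start can never interleave."""

    def __init__(self):
        self._lock = threading.Lock()
        self.sync_in_progress = False
        self.sent = []

    def start_sync(self):
        # A sync has to take the same lock, so it waits for any send in flight.
        with self._lock:
            self.sync_in_progress = True

    def finish_sync(self):
        with self._lock:
            self.sync_in_progress = False

    def try_send(self, message):
        # lock -> test for sync -> send -> unlock
        with self._lock:
            if self.sync_in_progress:
                return False  # caller has to retry later
            self.sent.append(message)
            return True


gate = CpgSendGate()
first = gate.try_send("msg-1")        # accepted, no sync running
gate.start_sync()
during_sync = gate.try_send("msg-2")  # refused while the sync is running
gate.finish_sync()
after_sync = gate.try_send("msg-2")   # accepted again
```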


So in the above scenario, does corosync delay sending the group messages
while the service is sending synchronization messages? How does it
differentiate between these messages?
It doesn't.

The use case for the above message sequence is this:

-- When a new node joins the ring and the CPG service is started on the
new node, all the existing CPG services want to let the new node know
about the existing groups and the group membership. So let's say the
CPG services are sending updates about their group membership to the new
service. If they then receive a groupLeave (e.g., the app crashes) and
send a leave message to all the CPG services over totem, we must ensure
the proper ordering of the group updates and the leave messages.
Otherwise the services will go out of sync.

I didn't get the question. Can you please try to come up with an example?
As per my understanding, the corosync services process a client request only when they receive it back over the ring, i.e.:
- The client sends a request.
- The request goes to the corresponding service in the local corosync daemon on the same node.
- The service broadcasts the request over the ring, so that all the other services receive it. The request message is also received back by the sender.
- When each of the services receives the request through the ring, it processes the request.

For example, the cpg_join() request from the client goes to the cpg service on the same node through the qb IPC interface. The cpg service that receives the "cpg_join" request from the client broadcasts it over the ring to all the CPG services on the ring. So every cpg service on the ring (including the sender) receives this message back in the msgDelivery callback and only then processes the join request, i.e. it adds the new member to the group and sends out the configChange callback to each of the members in the ring.
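
To make the flow I am describing concrete, here is a toy model. It is a sketch of my understanding, not the actual corosync implementation; Ring, CpgService and their methods are hypothetical names, and the ring is reduced to a single ordered queue.

```python
from collections import deque


class Ring:
    """Toy ordered broadcast: every service sees every message, in send order."""

    def __init__(self):
        self.services = []
        self.queue = deque()

    def broadcast(self, message):
        self.queue.append(message)

    def deliver_all(self):
        while self.queue:
            message = self.queue.popleft()
            for service in self.services:
                service.on_deliver(message)


class CpgService:
    def __init__(self, node_id, ring):
        self.node_id = node_id
        self.ring = ring
        self.members = set()
        ring.services.append(self)

    def client_join(self, member):
        # The local state is NOT changed here; the request is only broadcast.
        self.ring.broadcast(("join", member))

    def on_deliver(self, message):
        # State changes only when the message comes back over the ring.
        kind, member = message
        if kind == "join":
            self.members.add(member)
        elif kind == "leave":
            self.members.discard(member)


ring = Ring()
node_a = CpgService("A", ring)
node_b = CpgService("B", ring)

node_a.client_join("app1")
before_delivery = set(node_a.members)  # still empty, even on the sender
ring.deliver_all()
```

After deliver_all(), both services hold the same member set, and the sender learned about its own join only through delivery.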

So let's consider the following scenario:

- An application invokes the cpg_leave() API.
- The request goes to the local cpg_service, which broadcasts this message to all the other cpg services by invoking cpg_node_joinleave_send. Note that the member is removed from the group data structure only when the services receive this message back through the message delivery callback.
- Meanwhile, if another node joins and a sync is invoked, it is possible that the nodes send a sync message that includes the member whose leave request is being processed (the request has been received and sent over the ring, but hasn't been received back yet).
- So the order of messages received by the new node could be:
    -- It receives leave message for the member
    -- Then it receives sync message from other nodes

In the above sequence, the cpg_service on the new node does not know about the member when it receives the leave message first, so it might just discard that message. It then receives the sync and updates its data structure with the member who has already left.

The other nodes would receive the leave request and remove the member from their data structure, while the new node would keep the member because it missed the leave message! So it would lead to an inconsistent view of the groups across the nodes, wouldn't it?
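
Here is the ordering I am worried about, written as a toy simulation. It only models my hypothetical sequence (leave delivered before sync on the new node); whether corosync can actually produce this ordering is exactly my question. NodeView and its methods are made-up names, and the sync is applied only to the new node to keep the sketch short.

```python
class NodeView:
    """Toy per-node view of one group's membership."""

    def __init__(self, members=()):
        self.members = set(members)

    def on_leave(self, member):
        # A leave for an unknown member is silently dropped.
        self.members.discard(member)

    def on_sync(self, snapshot):
        # Sync merges the members reported by the existing nodes.
        self.members |= set(snapshot)


old_node = NodeView({"app1", "app2"})
new_node = NodeView()

# The sync data is built while app1's leave is still in flight,
# so the snapshot still contains app1.
sync_snapshot = set(old_node.members)

# Delivery order seen by the nodes: leave first, then sync.
old_node.on_leave("app1")
new_node.on_leave("app1")       # unknown member, nothing to remove
new_node.on_sync(sync_snapshot)

views_differ = old_node.members != new_node.members
```

In this model the old node ends up without app1 while the new node still lists it, which is the inconsistency described above.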

Regards,
Shridhar

_______________________________________________
discuss mailing list
discuss@xxxxxxxxxxxx
http://lists.corosync.org/mailman/listinfo/discuss



