Re: [PATCH] From: Yunkai Zhang: Today, I have observed one of the reasons that corosync runs into the FAILED TO RECEIVE state.

That was quick, thanks :)

Sent from my iPhone

On 2011-11-29, at 22:43, Steven Dake <sdake@xxxxxxxxxx> wrote:

> There were five nodes (A, B, C, D, E) in my test, and I limited the
> incoming UDP rate on node C with these iptables commands:
> iptables -A INPUT -i eth0 -p udp -m limit --limit 10000/s
> --limit-burst 1 -j ACCEPT
> iptables -A INPUT -i eth0 -p udp -j DROP
> 
> After about an hour, node C had missed some MCAST messages. Its
> state was as follows:
> ==state of C node==
> my_aru:0x805
> my_high_seq_received:0xC2C
> my_aru_count:7
> 
> =>received MCAST message with seq:806 from node B
> =>enter *message_handler_mcast*
>  =>add this message to regular_sort_queue
>  ...
>  =>enter *update_aru* function
>    => range = (my_high_seq_received - my_aru)
>             = (0xC2C - 0x805)
>             = 1063
>    => if range > 1024, do nothing and return directly.
> ==END==
> 
> According to this logic, once (my_high_seq_received - my_aru) > 1024,
> my_aru is no longer updated, even though corosync can still receive
> MCAST messages retransmitted by other nodes.
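
The arithmetic in the trace above can be checked with a small standalone sketch. It is not the corosync code: `ARU_RANGE_LIMIT` and `aru_update_skipped` are hypothetical names for this illustration, and only the 1024 limit and the two sequence numbers come from the report.

```c
/* Hypothetical name for the limit tested by the early return in update_aru(). */
#define ARU_RANGE_LIMIT 1024u

/*
 * Returns 1 when the early return would fire, i.e. when my_aru would be
 * left untouched, and 0 otherwise. Illustration only, not corosync code.
 */
static int aru_update_skipped(unsigned int my_high_seq_received,
                              unsigned int my_aru)
{
    unsigned int range = my_high_seq_received - my_aru;

    return range > ARU_RANGE_LIMIT;
}
```

With the values reported for node C, `aru_update_skipped(0xC2C, 0x805)` sees a range of 0x427 (1063 decimal) and returns 1, so my_aru stays at 0x805.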
> 
> But at that time, my_aru_count was only 7. So corosync on node C
> would stay in this state until my_aru_count reached
> fail_to_recv_const (the default value is 2500). That is a long time
> for corosync, and it was simply wasted.
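
To give a rough idea of how long that wait is, here is a minimal sketch. It assumes that my_aru_count grows by one per token rotation while my_aru is stalled; that increment rule is an assumption of this example, not something stated in the report. `rotations_until_failed` is a hypothetical helper, and 2500 is the default quoted above.

```c
/* Default quoted in the report; the real value comes from the configuration. */
#define FAIL_TO_RECV_CONST 2500u

/*
 * Hypothetical helper: how many more increments of my_aru_count are needed
 * before it reaches fail_to_recv_const. Assumes one increment per token
 * rotation while my_aru does not advance.
 */
static unsigned int rotations_until_failed(unsigned int my_aru_count,
                                           unsigned int fail_to_recv_const)
{
    if (my_aru_count >= fail_to_recv_const) {
        return 0;
    }
    return fail_to_recv_const - my_aru_count;
}
```

Under that assumption, node C with my_aru_count at 7 would sit through 2493 further rotations before entering FAILED TO RECEIVE.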
> 
> To solve this issue, maybe we could enlarge the range limit in the
> update_aru function? Or we could simply drop the range check. That
> seems harmless, because fail_to_recv_const already controls how
> long a node may stay stalled.
> 
> Signed-off-by: Steven Dake <sdake@xxxxxxxxxx>
> ---
> exec/totemsrp.c |    3 ---
> 1 files changed, 0 insertions(+), 3 deletions(-)
> 
> diff --git a/exec/totemsrp.c b/exec/totemsrp.c
> index 4011514..f29de29 100644
> --- a/exec/totemsrp.c
> +++ b/exec/totemsrp.c
> @@ -2417,9 +2417,6 @@ static void update_aru (
>    }
> 
>    range = instance->my_high_seq_received - instance->my_aru;
> -    if (range > 1024) {
> -        return;
> -    }
> 
>    my_aru_saved = instance->my_aru;
>    for (i = 1; i <= range; i++) {
> -- 
> 1.7.7.3
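
For readers following along, the effect of dropping the check can be sketched with a toy model. Only the `range` computation and the loop header appear in the diff. The loop body, the `received` array standing in for the sort queue, and the names `WINDOW` and `advance_aru` are assumptions made for this example. The sketch assumes the loop walks forward from my_aru and stops at the first missing sequence number.

```c
#include <stdbool.h>

#define WINDOW 4096u /* hypothetical toy window; range must stay below it */

/*
 * Toy stand-in for the sort queue: received[seq % WINDOW] is true once the
 * message with that sequence number has arrived. Returns the new my_aru.
 */
static unsigned int advance_aru(const bool received[WINDOW],
                                unsigned int my_aru,
                                unsigned int my_high_seq_received)
{
    unsigned int range = my_high_seq_received - my_aru;
    unsigned int i;

    /* No "range > 1024" early return here, mirroring the patched code. */
    for (i = 1; i <= range; i++) {
        if (!received[(my_aru + i) % WINDOW]) {
            break; /* first gap: my_aru cannot move past it */
        }
    }
    return my_aru + (i - 1);
}
```

In this model, once the retransmitted messages 0x806 and 0x807 arrive, my_aru moves from 0x805 to 0x807 even though the range is 1063. With the early return in place it would have stayed at 0x805.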
> 
> _______________________________________________
> discuss mailing list
> discuss@xxxxxxxxxxxx
> http://lists.corosync.org/mailman/listinfo/discuss