Re: Fwd: [PATCH] [totemrrp] Reset timer_problem_decrementer to zero in active_timer_problem_decrementer_cancel()

Jason,
Nice catch. I've pushed the commit to master and flatiron.

Thanks,
  Honza
> Hi All,
> Please take a look at this patch. Thanks!
> ---------- Forwarded message ----------
> From: "Jason" <huzhijiang@xxxxxxxxx>
> Date: November 10, 2014, 9:26 PM
> Subject: [PATCH] [totemrrp] Reset timer_problem_decrementer to zero in
> active_timer_problem_decrementer_cancel()
> To: "Jason" <discuss@xxxxxxxxxxxx>
> Cc:
> 
> After a heartbeat link is marked FAULTY and then automatically re-enabled,
> active_instance->timer_problem_decrementer is not reset to zero. As a
> result, in the next timer_function_active_token_expired() round,
> active_timer_problem_decrementer_start() is not called, so
> active_instance->counter_problems for this link can no longer be
> decreased. RRP thereby loses its ability to tolerate network
> fluctuation.
> 
> This problem can be reproduced by the following sequence:
> 1) Set RRP to active mode and configure at least 2 heartbeat links.
> 2) Unplug one link until corosync-cfgtool -s shows it as FAULTY.
> 3) Re-plug the link and wait until corosync-cfgtool -s shows it as active
> with no faults.
> 4) Unplug the link again, but quickly re-plug it before it becomes FAULTY.
> 5) corosync-cfgtool -s now shows the link stuck in the "Incrementing
> problem counter" state even though it is physically healthy.
> 
> The fix is to reset timer_problem_decrementer to zero in
> active_timer_problem_decrementer_cancel().
> 
> 
> Signed-off-by: Jason <huzhijiang@xxxxxxxxx>
> ---
>  exec/totemrrp.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/exec/totemrrp.c b/exec/totemrrp.c
> index 95a789e..a798bba 100644
> --- a/exec/totemrrp.c
> +++ b/exec/totemrrp.c
> @@ -1542,6 +1542,7 @@ static void active_timer_problem_decrementer_cancel (
>          qb_loop_timer_del (
>                 active_instance->rrp_instance->poll_handle,
>                 active_instance->timer_problem_decrementer);
> +        active_instance->timer_problem_decrementer = 0;
>  }
> 
> 
> --
> 1.9.4.msysgit.2
> 
> 
> 
> _______________________________________________
> discuss mailing list
> discuss@xxxxxxxxxxxx
> http://lists.corosync.org/mailman/listinfo/discuss
> 





