Re: kill -TERM does not stop corosync daemon

Update.
According to the AMF log, which reports a timeout, I can confirm that the node with this issue could not receive multicast messages at that time, not even the ones it sent itself. What I do not understand is why it could still receive the JOIN message that resulted in the pause detection.
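
The self-reception symptom can be checked outside corosync with a small loopback test: join a multicast group, send one datagram to it, and see whether it comes back. The sketch below is only an illustration. The function name is hypothetical, and the group address 239.255.1.1 and port 54050 are placeholders, not values from any real corosync.conf. It exercises the host's multicast path in general and says nothing about totem itself.

```python
import socket
import struct


def multicast_loopback_ok(group="239.255.1.1", port=54050, timeout=2.0):
    """Return True if a datagram sent to `group` is received by this host.

    group and port are placeholder values chosen for this sketch.
    Any socket error or timeout is reported as False.
    """
    payload = b"mcast-self-test"
    rx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    tx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    try:
        # Receiver: bind to the port and join the group on the default interface.
        rx.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        rx.bind(("", port))
        mreq = struct.pack("4s4s", socket.inet_aton(group),
                           socket.inet_aton("0.0.0.0"))
        rx.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, mreq)
        rx.settimeout(timeout)

        # Sender: keep loopback enabled so the local host gets its own packet.
        tx.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_LOOP, 1)
        tx.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, 1)
        tx.sendto(payload, (group, port))

        data, _ = rx.recvfrom(1024)
        return data == payload
    except OSError:
        # Covers timeouts as well as "no multicast route" style failures.
        return False
    finally:
        rx.close()
        tx.close()
```

Calling multicast_loopback_ok() on the affected node while the problem is present, and again on a healthy node, would show whether the host itself drops its own multicast traffic.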

On 2012-11-25 at 9:39 PM, "jason" <huzhijiang@xxxxxxxxx> wrote:

Hi All,
I recently ran into a problem with corosync-1.4.4: kill -TERM does not stop the corosync daemon. What I can confirm so far:
1) The thread running corosync_exit_thread_handler() has finished and disappeared (confirmed with gdb "info threads"). So the hooks into sched_work(), which get fired on token_send, may not have had a chance to run (no token to send?).
2) I did not have a firewall running when this occurred.
3) There was no consensus timeout log before this problem happened.
4) I attached gdb to corosync, which paused it for some seconds. When I let it continue, I saw the pause detection timer trigger (according to the log). About 20 seconds later, the log showed a new confchg and the service unload happening simultaneously, and corosync finally exited normally. I think it was the new token created by the new ring that finally let corosync exit, but I cannot tell whether the creation of the new ring was influenced by my running gdb or not.
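
The hypothesis in point 1 can be illustrated with a toy model. This is not corosync code: the class and function names (TokenDrivenScheduler, on_token_send, request_exit) are invented for the illustration. It only assumes, as point 1 suggests, that the work queued by the exit handler runs when a token is sent and at no other time.

```python
from collections import deque


class TokenDrivenScheduler:
    """Toy model (hypothetical): queued work runs only when a token is sent."""

    def __init__(self):
        self.pending = deque()
        self.log = []

    def schedule(self, name, fn):
        # Queue the work; nothing is executed here.
        self.pending.append((name, fn))

    def on_token_send(self):
        # Drain the queue each time the token is forwarded.
        while self.pending:
            name, fn = self.pending.popleft()
            fn()
            self.log.append(name)


def request_exit(sched, state):
    # Models the exit thread: it queues the unload work and then ends.
    sched.schedule("unload_services", lambda: state.update(running=False))


sched = TokenDrivenScheduler()
state = {"running": True}

request_exit(sched, state)       # SIGTERM handled, exit thread is gone
stalled = state["running"]       # no token yet, so the daemon is still up

sched.on_token_send()            # a token finally circulates (e.g. new ring)
after = state["running"]         # the queued unload work has now run
```

In this model the daemon stays up after the exit request until a token is processed, which would match a node that is stuck without a token and only exits once a new ring produces one.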

I have not been able to reproduce this issue yet, but I am trying to. Could you please help me look into it?

Many thanks!

_______________________________________________
discuss mailing list
discuss@xxxxxxxxxxxx
http://lists.corosync.org/mailman/listinfo/discuss
