Re: shutdown seems to get hung up quite frequently


 



Hi Steven,

Do you plan to port the new shutdown method in corosync-2.x back to corosync-1.4.x? With corosync-1.4.5, shutting down corosync with kill -3 has failed for us several times. In the latest case, corosync_exit_sem had not yet been initialized by sem_init() when kill -3 was issued, so the sem_post() in corosync_shutdown_request() failed to trigger corosync_exit_thread_handler(). I think the fix is simply to call sem_init() before we install the signal handler. But as you say, if corosync-2.x has a stronger shutdown mechanism, why not port it back to 1.4.x?
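To illustrate the ordering I mean, here is a minimal standalone sketch. It is not the corosync source: exit_sem, exit_thread_handler(), shutdown_request() and setup_shutdown() are hypothetical stand-ins for the names above, and I am assuming kill -3 means SIGQUIT, as it does on Linux.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <pthread.h>
#include <semaphore.h>
#include <signal.h>
#include <string.h>

/* Hypothetical stand-ins for the corosync 1.4.x names; not the real code. */
static sem_t exit_sem;
static volatile sig_atomic_t shutdown_done = 0;

/* Worker thread: blocks until the signal handler posts the semaphore. */
static void *exit_thread_handler(void *arg)
{
    (void)arg;
    while (sem_wait(&exit_sem) == -1 && errno == EINTR) {
        /* interrupted by a signal, retry the wait */
    }
    shutdown_done = 1; /* the orderly shutdown work would go here */
    return NULL;
}

/* Signal handler: sem_post() is async-signal-safe per POSIX. */
static void shutdown_request(int signum)
{
    (void)signum;
    sem_post(&exit_sem);
}

/*
 * Initialize the semaphore and start the worker BEFORE installing the
 * handler, so a signal that arrives early can never post to an
 * uninitialized semaphore.
 */
static int setup_shutdown(pthread_t *thread)
{
    struct sigaction sa;

    if (sem_init(&exit_sem, 0, 0) != 0)
        return -1;
    if (pthread_create(thread, NULL, exit_thread_handler, NULL) != 0)
        return -1;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = shutdown_request;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGQUIT, &sa, NULL);
}
```

Because the semaphore counts posts, it does not matter whether the worker reaches sem_wait() before or after the signal arrives, as long as sem_init() has already run.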

On Feb 15, 2013 7:03 AM, "Steven Dake" <steven.dake@xxxxxxxxx> wrote:


On Thu, Feb 14, 2013 at 1:23 PM, Brian J. Murrell <brian.murrell@xxxxxxxxxxxxxxx> wrote:
On EL6, at least, trying to stop corosync (kill -TERM) seems to fail
quite frequently with corosync seemingly just not wanting to take heed
of the signal and exit.  corosync-cfgtool -H doesn't seem to work either
and I just end up killing it with a SIGKILL.

Shutdown has been a never-ending source of frustration for corosync, now solved with the 2.x series :)

The reason the TERM is not honored immediately is that Corosync wants to shut down in an orderly fashion on a TERM by quiescing services and shutting down cleanly with no pending messages.  Sometimes this is not possible quickly because the network is flaky or blocked in some way (such as iptables).

I had thought we had sorted all this out for the 1.4 series, though, so if you could provide your corosync rpm version, that would be helpful.

 
Is a SIGKILL really the only way to deal with this problem?  Should this
need be codified into the initscript?  i.e. try SIGTERM and then SIGKILL
after a timeout?  What's a reasonable timeout for SIGTERM to have
worked?


SIGTERM should be honored by the corosync process rather than hacking around it with a SIGKILL.
 
Cheers,
b.




_______________________________________________
discuss mailing list
discuss@xxxxxxxxxxxx
http://lists.corosync.org/mailman/listinfo/discuss


