Hi Jan,

Here is my patch against corosync-1.4.5.

diff -ruNp corosync-1.4.5-orig/exec/main.c corosync-1.4.5/exec/main.c
--- corosync-1.4.5-orig/exec/main.c	2012-12-12 18:47:52.000000000 +0800
+++ corosync-1.4.5/exec/main.c	2013-02-26 20:48:48.937500000 +0800
@@ -1620,6 +1620,14 @@ int main (int argc, char **argv, char **
 	log_printf (LOGSYS_LEVEL_NOTICE, "Corosync Cluster Engine ('%s'): started and ready to provide service.\n", VERSION);
 	log_printf (LOGSYS_LEVEL_INFO, "Corosync built-in features:" PACKAGE_FEATURES "\n");
 
+	/*
+	 * Create exit semaphore.
+	 */
+	res = sem_init (&corosync_exit_sem, 0, 0);
+	if (res != 0) {
+		log_printf (LOGSYS_LEVEL_ERROR, "Corosync Executive couldn't create exit semaphore.\n");
+		corosync_exit_error (AIS_DONE_FATAL_ERR);
+	}
 
 	(void)signal (SIGINT, sigintr_handler);
 	(void)signal (SIGUSR2, sigusr2_handler);
@@ -1803,14 +1811,8 @@ int main (int argc, char **argv, char **
 	// TODO what is this hack for? usleep(totem_config.token_timeout * 2000);
 
 	/*
-	 * Create semaphore and start "exit" thread
+	 * Start "exit" thread
 	 */
-	res = sem_init (&corosync_exit_sem, 0, 0);
-	if (res != 0) {
-		log_printf (LOGSYS_LEVEL_ERROR, "Corosync Executive couldn't create exit thread.\n");
-		corosync_exit_error (AIS_DONE_FATAL_ERR);
-	}
-
 	res = pthread_create (&corosync_exit_thread, NULL, corosync_exit_thread_handler, NULL);
 	if (res != 0) {
 		log_printf (LOGSYS_LEVEL_ERROR, "Corosync Executive couldn't create exit thread.\n");

On Mon, Feb 25, 2013 at 5:14 PM, Jan Friesse <jfriesse@xxxxxxxxxx> wrote:
> jason napsal(a):
>> Hi Steven,
>>
>> Do you have plan to port the new shutdown method in corosync-2.x back to
>> corosync-1.4.x? When using corosync-1.4.5, we encountered shutdown
>
> It's almost impossible. Actually, shutdown sequence itself didn't
> changed much. What did is usage of threads (or correctly said, no
> threads in 2.x).
>
>> corosync by using kill -3 failed several times.
>> The latest one is because
>> when issuing kill -3, corosync_exit_sem had not been initialized by
>> sem_init(), so sem_post() in corosync_shutdown_request() failed to trigger
>> corosync_exit_thread_handler() to work. The resolution I think is simply to
>> call the sem_init() before we install signal handler. But as you say, if
>
> Can you send patch?
>
>> corosync-2.x has more stronger mechanism for shutdown, why not port it back
>> to 1.4.x?
>> On Feb 15, 2013 7:03 AM, "Steven Dake" <steven.dake@xxxxxxxxx> wrote:
>>
>
> Honza
>
>>>
>>> On Thu, Feb 14, 2013 at 1:23 PM, Brian J. Murrell <
>>> brian.murrell@xxxxxxxxxxxxxxx> wrote:
>>>
>>>> On EL6, at least, trying to stop corosync (kill -TERM) seems to fail
>>>> quite frequently with corosync seemingly just not wanting to take heed
>>>> of the signal and exit. corosync-cfgtool -H doesn't seem to work either
>>>> and I just end up killing it with a SIGKILL.
>>>>
>>>> Shutdown has been a never-ending source of frustration for corosync, now
>>> solved with the 2.x series :)
>>>
>>> The reason the TERM is not honored immediately is that Corosync wants to
>>> shut down in an orderly fashion on a TERM by quiescing services and
>>> shutting down cleanly with no pending messages. Sometimes this is not
>>> possible quickly because the network is flaky or blocked in some way (such
>>> as iptables).
>>>
>>> I had thought we had sorted all this out for 1.4 series though, so if you
>>> could provide more information on your corosync rpm version, that might be
>>> helpful.
>>>
>>>> Is a SIGKILL really the only way to deal with this problem? Should this
>>>> need be codified into the initscript? i.e. try SIGTERM and then SIGKILL
>>>> after a timeout? What's a reasonable timeout for SIGTERM to have
>>>> worked?
>>>>
>>> sigterm should be honored by the corosync process rather then hacking
>>> around with a sigkill.
>>>
>>>> Cheers,
>>>> b.
>>>>
>>>> _______________________________________________
>>>> discuss mailing list
>>>> discuss@xxxxxxxxxxxx
>>>> http://lists.corosync.org/mailman/listinfo/discuss

--
Yours,
Jason
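
For anyone reading the thread later, here is a minimal standalone sketch of the ordering the patch above establishes: create the semaphore first, install the signal handler second, and start the waiting thread whenever startup gets to it. This is not corosync code. The names exit_sem, on_sigquit, exit_thread_main, install_shutdown_handler and start_exit_thread are hypothetical and exist only for this illustration, and SIGQUIT is used because that is what kill -3 sends on Linux.

```c
#include <pthread.h>
#include <semaphore.h>
#include <signal.h>

/* Hypothetical names for this sketch; only the ordering mirrors the patch. */
static sem_t exit_sem;
static pthread_t exit_thread;
static int shutdown_ran = 0;	/* written by the exit thread, read after join */

/* Signal handler: sem_post() is async-signal-safe per POSIX. */
static void on_sigquit(int sig)
{
	(void)sig;
	sem_post(&exit_sem);
}

/* Blocks until a shutdown has been requested, then does the "cleanup". */
static void *exit_thread_main(void *arg)
{
	(void)arg;
	while (sem_wait(&exit_sem) != 0) {
		/* retry if the wait was interrupted by a signal */
	}
	shutdown_ran = 1;
	return NULL;
}

/*
 * Create the semaphore, and only then install the handler, so a signal
 * can never reach sem_post() on an uninitialized semaphore.
 */
static int install_shutdown_handler(void)
{
	if (sem_init(&exit_sem, 0, 0) != 0) {
		return -1;
	}
	if (signal(SIGQUIT, on_sigquit) == SIG_ERR) {
		return -1;
	}
	return 0;
}

/* Started later in startup; a post made before this point is not lost. */
static int start_exit_thread(void)
{
	return pthread_create(&exit_thread, NULL, exit_thread_main, NULL);
}
```

With this ordering, a signal that arrives between install_shutdown_handler() and start_exit_thread() just leaves the semaphore count at 1, and the thread picks it up as soon as it starts waiting. Build with -pthread.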