Hello all,

I sometimes have trouble starting ccsd and then gulm under heavy CPU load. ccsd's init script reports that it is running, but the daemon is not yet fully initialized.

The problem is that ccsd's main process returns before the daemonized ccsd process has finished initializing its sockets: the "cluster_communicator" thread sends SIGTERM to the parent process before the main thread has finished its initialization work.

With the attached patch, the cluster_communicator thread is started only after the main thread has finished initializing. It works well under any load. Any daemon that needs to connect to ccsd will then succeed.

It was tested with cluster-1.03, but it should also work with older versions, since the ccsd files do not seem to have changed much.

-- 
Mathieu Avila
Index: cluster/ccs/daemon/ccsd.c
===================================================================
--- cluster/ccs/daemon/ccsd.c	(revision 20936)
+++ cluster/ccs/daemon/ccsd.c	(working copy)
@@ -74,11 +74,6 @@
     free(msg);
   }
 
-  if(start_cluster_monitor_thread()){
-    log_err("Unable to create thread.\n");
-    exit(EXIT_FAILURE);
-  }
-
   memset(&addr, 0, sizeof(struct sockaddr_storage));
 
   /** Setup the socket to communicate with the CCS library **/
@@ -177,6 +172,11 @@
   if (sfds[2] >= 0)
     FD_SET(sfds[2], &rset);
 
+  if(start_cluster_monitor_thread()){
+    log_err("Unable to create thread.\n");
+    exit(EXIT_FAILURE);
+  }
+
   while(1){
     int len = addr_size;

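For anyone who wants to see the ordering problem in isolation, here is a minimal sketch. It is not ccsd code: the names connect_after_ready, child_main and make_addr, the /tmp socket path, and the "gate" pipe are all made up for illustration. The "ready" notification is sent inline rather than from a separate thread so that the result is deterministic, and the pipe read stands in for initialization being delayed by CPU load. A child process plays the daemonized ccsd, and the parent plays the init script that trusts SIGTERM as the "ready" signal and then immediately tries to connect.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/un.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fill a Unix-domain address for the (hypothetical) demo socket path. */
static void make_addr(struct sockaddr_un *sa, const char *path)
{
    memset(sa, 0, sizeof(*sa));
    sa->sun_family = AF_UNIX;
    strncpy(sa->sun_path, path, sizeof(sa->sun_path) - 1);
}

/* Child ("daemon"): set up the socket and signal the parent in the given order.
 * Never returns. */
static void child_main(const char *path, int notify_first, int gate_fd)
{
    struct sockaddr_un sa;
    char c;
    int lfd;

    make_addr(&sa, path);
    if (notify_first) {
        kill(getppid(), SIGTERM);       /* old order: "ready" sent too early */
        if (read(gate_fd, &c, 1) < 0)   /* stand-in for init delayed by load */
            _exit(1);
    }
    lfd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (lfd < 0 || bind(lfd, (struct sockaddr *)&sa, sizeof(sa)) < 0 ||
        listen(lfd, 1) < 0)
        _exit(1);
    if (!notify_first)
        kill(getppid(), SIGTERM);       /* patched order: socket listens first */
    if (read(gate_fd, &c, 1) < 0)       /* stay alive until the parent is done */
        _exit(1);
    unlink(path);
    _exit(0);
}

/* Returns 1 if a client can connect right after the "ready" signal,
 * 0 if the connect fails, -1 on a setup error. */
int connect_after_ready(int notify_first)
{
    char path[64];
    struct sockaddr_un sa;
    sigset_t set, old;
    int gate[2], sig, cfd, ok = -1, status;
    pid_t pid;

    assert(notify_first == 0 || notify_first == 1);
    snprintf(path, sizeof(path), "/tmp/ready_demo_%ld_%d.sock",
             (long)getpid(), notify_first);
    unlink(path);

    /* Block the signals before fork() so none can arrive before sigwait(). */
    sigemptyset(&set);
    sigaddset(&set, SIGTERM);
    sigaddset(&set, SIGCHLD);           /* lets us notice a child that died early */
    if (sigprocmask(SIG_BLOCK, &set, &old) < 0 || pipe(gate) < 0)
        return -1;
    pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        close(gate[1]);
        child_main(path, notify_first, gate[0]);
    }
    close(gate[0]);

    if (sigwait(&set, &sig) == 0 && sig == SIGTERM) {
        /* The parent now believes the daemon is ready and connects at once. */
        make_addr(&sa, path);
        cfd = socket(AF_UNIX, SOCK_STREAM, 0);
        ok = cfd >= 0 && connect(cfd, (struct sockaddr *)&sa, sizeof(sa)) == 0;
        if (cfd >= 0)
            close(cfd);
        if (write(gate[1], "x", 1) < 0) /* release the child */
            ok = -1;
    }
    close(gate[1]);
    waitpid(pid, &status, 0);
    sigprocmask(SIG_SETMASK, &old, NULL);
    return ok;
}
```

Calling connect_after_ready(0) models the patched order (sockets first, then the notification) and connect_after_ready(1) models the original order. On a POSIX system the first call is expected to return 1 and the second 0, because in the second case the socket does not exist yet when the parent tries to connect.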
-- 
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster