lrmd fails here:

mainloop_add_ipc_server(CRM_SYSTEM_LRMD, QB_IPC_SHM, &lrmd_ipc_callbacks);

Calling the following function from /lib/common/mainloop.c

-------8<--------
qb_ipcs_service_t *mainloop_add_ipc_server(
    const char *name, enum qb_ipc_type type, struct qb_ipcs_service_handlers *callbacks)
{
    int rc = 0;
    qb_ipcs_service_t* server = NULL;

    if(gio_map == NULL) {
        gio_map = qb_array_create_2(64, sizeof(struct gio_to_qb_poll), 1);
    }

    server = qb_ipcs_create(name, 0, pick_ipc_type(type), callbacks);
    qb_ipcs_poll_handlers_set(server, &gio_poll_funcs);

    rc = qb_ipcs_run(server);
    if (rc < 0) {
        crm_err("Could not start %s IPC server: %s (%d)", name, strerror(rc), rc);
        return NULL;
    }

    return server;
}
--------------------------

I think a shared memory region should be created using libqb. Is this
known to work on BSD systems?

2012/12/11 Stephan <stephanwib@xxxxxxxxxxxxxx>:
> Yes, kqueues are not inherited. I recompiled and installed pacemaker
> 1.1 for corosync 2.x. It doesn't yet work (I just started pacemakerd..
> I hope this is okay) ... it seems that lrmd is facing the first issue:
>
> lrmd[15312]: error: mainloop_add_ipc_server: Could not start lrmd IPC server: Unknown error: 4294967210 (-86)
>
>
> All messages:
>
> -----8<----------
> Dec 11 14:22:31 ctx4980gate2 pacemakerd[13003]: info: crm_update_callsites: Enabling callsites based on priority=6, files=(null), functions=(null), formats=(null), tags=(null)
> Dec 11 14:22:32 ctx4980gate2 corosync[18423]: [QB ] got EV_EOF on fd 20.
> Dec 11 14:22:32 ctx4980gate2 pacemakerd[13003]: notice: crm_add_logfile: Additional logging available in /var/log/cluster/corosync.log
> Dec 11 14:22:32 ctx4980gate2 pacemakerd[13003]: notice: main: Starting Pacemaker 1.1.8 (Build: 1f8858c): ncurses libqb-logging libqb-ipc lha-fencing corosync-native
> Dec 11 14:22:32 ctx4980gate2 corosync[18423]: [QB ] got EV_EOF on fd 18.
> Dec 11 14:22:32 ctx4980gate2 pacemakerd[13003]: notice: update_node_processes: 0x7f7ff7b09150 Node 3232235777 now known as ctx4980gate2, was:
> Dec 11 14:22:32 ctx4980gate2 cib[13836]: notice: crm_add_logfile: Additional logging available in /var/log/cluster/corosync.log
> Dec 11 14:22:32 ctx4980gate2 cib[13836]: info: crm_update_callsites: Enabling callsites based on priority=6, files=(null), functions=(null), formats=(null), tags=(null)
> Dec 11 14:22:32 ctx4980gate2 stonith-ng[7834]: notice: crm_add_logfile: Additional logging available in /var/log/cluster/corosync.log
> Dec 11 14:22:32 ctx4980gate2 stonith-ng[7834]: info: crm_update_callsites: Enabling callsites based on priority=6, files=(null), functions=(null), formats=(null), tags=(null)
> Dec 11 14:22:32 ctx4980gate2 stonith-ng[7834]: notice: crm_cluster_connect: Connecting to cluster infrastructure: corosync
> Dec 11 14:22:32 ctx4980gate2 cib[13836]: notice: main: Using new config location: /var/lib/pacemaker/cib
> Dec 11 14:22:32 ctx4980gate2 cib[13836]: warning: retrieveCib: Cluster configuration not found: /var/lib/pacemaker/cib/cib.xml
> Dec 11 14:22:32 ctx4980gate2 cib[13836]: warning: readCibXmlFile: Primary configuration corrupt or unusable, trying backup...
> Dec 11 14:22:32 ctx4980gate2 cib[13836]: warning: readCibXmlFile: Continuing with an empty configuration.
> Dec 11 14:22:32 ctx4980gate2 lrmd[15312]: notice: crm_add_logfile: Additional logging available in /var/log/cluster/corosync.log
> Dec 11 14:22:32 ctx4980gate2 lrmd[15312]: info: crm_update_callsites: Enabling callsites based on priority=6, files=(null), functions=(null), formats=(null), tags=(null)
> Dec 11 14:22:32 ctx4980gate2 lrmd[15312]: error: mainloop_add_ipc_server: Could not start lrmd IPC server: Unknown error: 4294967210 (-86)
> Dec 11 14:22:32 ctx4980gate2 lrmd[15312]: error: try_server_create: New IPC server could not be created because another lrmd process exists, sending shutdown command to old lrmd process.
> Dec 11 14:22:32 ctx4980gate2 lrmd[15312]: error: mainloop_add_ipc_server: Could not start lrmd IPC server: Unknown error: 4294967210 (-86)
> Dec 11 14:22:32 ctx4980gate2 lrmd[15312]: error: try_server_create: New IPC server could not be created because another lrmd process exists, sending shutdown command to old lrmd process.
> Dec 11 14:22:32 ctx4980gate2 lrmd[15312]: error: mainloop_add_ipc_server: Could not start lrmd IPC server: Unknown error: 4294967210 (-86)
> Dec 11 14:22:32 ctx4980gate2 attrd[9542]: notice: crm_add_logfile: Additional logging available in /var/log/cluster/corosync.log
> Dec 11 14:22:32 ctx4980gate2 lrmd[15312]: error: try_server_create: New IPC server could not be created because another lrmd process exists, sending shutdown command to old lrmd process.
> Dec 11 14:22:32 ctx4980gate2 lrmd[15312]: error: mainloop_add_ipc_server: Could not start lrmd IPC server: Unknown error: 4294967210 (-86)
> Dec 11 14:22:32 ctx4980gate2 pengine[17349]: notice: crm_add_logfile: Additional logging available in /var/log/cluster/corosync.log
> Dec 11 14:22:32 ctx4980gate2 lrmd[15312]: error: try_server_create: New IPC server could not be created because another lrmd process exists, sending shutdown command to old lrmd process.
> Dec 11 14:22:32 ctx4980gate2 lrmd[15312]: error: mainloop_add_ipc_server: Could not start lrmd IPC server: Unknown error: 4294967210 (-86)
> Dec 11 14:22:32 ctx4980gate2 lrmd[15312]: error: try_server_create: New IPC server could not be created because another lrmd process exists, sending shutdown command to old lrmd process.
> Dec 11 14:22:32 ctx4980gate2 lrmd[15312]: error: mainloop_add_ipc_server: Could not start lrmd IPC server: Unknown error: 4294967210 (-86)
> Dec 11 14:22:32 ctx4980gate2 attrd[9542]: notice: crm_cluster_connect: Connecting to cluster infrastructure: corosync
> Dec 11 14:22:32 ctx4980gate2 lrmd[15312]: error: try_server_create: New IPC server could not be created because another lrmd process exists, sending shutdown command to old lrmd process.
> Dec 11 14:22:32 ctx4980gate2 lrmd[15312]: error: mainloop_add_ipc_server: Could not start lrmd IPC server: Unknown error: 4294967210 (-86)
> Dec 11 14:22:32 ctx4980gate2 lrmd[15312]: error: try_server_create: New IPC server could not be created because another lrmd process exists, sending shutdown command to old lrmd process.
> Dec 11 14:22:32 ctx4980gate2 lrmd[15312]: error: mainloop_add_ipc_server: Could not start lrmd IPC server: Unknown error: 4294967210 (-86)
> Dec 11 14:22:32 ctx4980gate2 pengine[17349]: error: mainloop_add_ipc_server: Could not start pengine IPC server: Unknown error: 4294967210 (-86)
> Dec 11 14:22:32 ctx4980gate2 lrmd[15312]: error: try_server_create: New IPC server could not be created because another lrmd process exists, sending shutdown command to old lrmd process.
> Dec 11 14:22:32 ctx4980gate2 pengine[17349]: error: main: Couldn't start IPC server
> Dec 11 14:22:32 ctx4980gate2 lrmd[15312]: error: mainloop_add_ipc_server: Could not start lrmd IPC server: Unknown error: 4294967210 (-86)
> Dec 11 14:22:32 ctx4980gate2 lrmd[15312]: error: try_server_create: New IPC server could not be created because another lrmd process exists, sending shutdown command to old lrmd process.
> Dec 11 14:22:32 ctx4980gate2 lrmd[15312]: error: mainloop_add_ipc_server: Could not start lrmd IPC server: Unknown error: 4294967210 (-86)
> Dec 11 14:22:32 ctx4980gate2 lrmd[15312]: error: try_server_create: New IPC server could not be created because another lrmd process exists, sending shutdown command to old lrmd process.
> Dec 11 14:22:32 ctx4980gate2 lrmd[15312]: error: main: Failed to allocate lrmd server. shutting down
> Dec 11 14:22:32 ctx4980gate2 pacemakerd[13003]: error: pcmk_child_exit: Child process lrmd exited (pid=15312, rc=255)
> Dec 11 14:22:32 ctx4980gate2 attrd[9542]: error: qb_ipcs_us_publish: Could not bind AF_UNIX (/var/run/attrd): Address already in use (48)
> Dec 11 14:22:32 ctx4980gate2 pacemakerd[13003]: notice: pcmk_child_exit: Respawning failed child process: lrmd
> Dec 11 14:22:32 ctx4980gate2 attrd[9542]: error: mainloop_add_ipc_server: Could not start attrd IPC server: Unknown error: 4294967248 (-48)
> Dec 11 14:22:32 ctx4980gate2 attrd[9542]: error: main: Could not start IPC server
> Dec 11 14:22:32 ctx4980gate2 attrd[9542]: error: main: Aborting startup
> Dec 11 14:22:32 ctx4980gate2 corosync[18423]: [QB ] got EV_EOF on fd 26.
> Dec 11 14:22:32 ctx4980gate2 pacemakerd[13003]: error: pcmk_child_exit: Child process pengine exited (pid=17349, rc=1)
> Dec 11 14:22:32 ctx4980gate2 pacemakerd[13003]: notice: pcmk_child_exit: Respawning failed child process: pengine
> Dec 11 14:22:32 ctx4980gate2 pacemakerd[13003]: error: pcmk_child_exit: Child process attrd exited (pid=9542, rc=100)
> Dec 11 14:22:32 ctx4980gate2 pacemakerd[13003]: warning: pcmk_child_exit: Pacemaker child process attrd no longer wishes to be respawned. Shutting ourselves down.
> Dec 11 14:22:32 ctx4980gate2 pacemakerd[13003]: notice: pcmk_shutdown_worker: Shuting down Pacemaker
> Dec 11 14:22:32 ctx4980gate2 pacemakerd[13003]: notice: stop_child: Stopping crmd: Sent -15 to process 13681
> Dec 11 14:22:32 ctx4980gate2 pacemakerd[13003]: notice: pcmk_child_exit: Child process crmd terminated with signal 15 (pid=13681, core=0)
> Dec 11 14:22:32 ctx4980gate2 pacemakerd[13003]: notice: stop_child: Stopping pengine: Sent -15 to process 10446
> Dec 11 14:22:32 ctx4980gate2 cib[13836]: notice: crm_cluster_connect: Connecting to cluster infrastructure: corosync
> Dec 11 14:22:32 ctx4980gate2 cib[13836]: error: qb_ipcs_us_publish: Could not bind AF_UNIX (/var/run/cib_ro): Permission denied (13)
> Dec 11 14:22:32 ctx4980gate2 cib[13836]: error: mainloop_add_ipc_server: Could not start cib_ro IPC server: Unknown error: 4294967283 (-13)
> Dec 11 14:22:32 ctx4980gate2 cib[13836]: error: qb_ipcs_us_publish: Could not bind AF_UNIX (/var/run/cib_rw): Permission denied (13)
> Dec 11 14:22:32 ctx4980gate2 cib[13836]: error: mainloop_add_ipc_server: Could not start cib_rw IPC server: Unknown error: 4294967283 (-13)
> Dec 11 14:22:32 ctx4980gate2 cib[13836]: error: mainloop_add_ipc_server: Could not start cib_shm IPC server: Unknown error: 4294967210 (-86)
> Dec 11 14:22:32 ctx4980gate2 cib[13836]: error: cib_init: Couldnt start all IPC channels, exiting.
> Dec 11 14:22:32 ctx4980gate2 corosync[18423]: [QB ] got EV_EOF on fd 26.
> Dec 11 14:22:32 ctx4980gate2 pacemakerd[13003]: error: pcmk_child_exit: Child process cib exited (pid=13836, rc=255)
> Dec 11 14:22:32 ctx4980gate2 lrmd[22677]: notice: crm_add_logfile: Additional logging available in /var/log/cluster/corosync.log
> Dec 11 14:22:32 ctx4980gate2 lrmd[22677]: info: crm_update_callsites: Enabling callsites based on priority=6, files=(null), functions=(null), formats=(null), tags=(null)
> Dec 11 14:22:32 ctx4980gate2 lrmd[22677]: error: mainloop_add_ipc_server: Could not start lrmd IPC server: Unknown error: 4294967210 (-86)
> Dec 11 14:22:32 ctx4980gate2 pengine[10446]: notice: crm_add_logfile: Additional logging available in /var/log/cluster/corosync.log
> Dec 11 14:22:32 ctx4980gate2 lrmd[22677]: error: try_server_create: New IPC server could not be created because another lrmd process exists, sending shutdown command to old lrmd process.
> Dec 11 14:22:32 ctx4980gate2 lrmd[22677]: error: mainloop_add_ipc_server: Could not start lrmd IPC server: Unknown error: 4294967210 (-86)
> Dec 11 14:22:32 ctx4980gate2 lrmd[22677]: error: try_server_create: New IPC server could not be created because another lrmd process exists, sending shutdown command to old lrmd process.
> Dec 11 14:22:32 ctx4980gate2 lrmd[22677]: error: mainloop_add_ipc_server: Could not start lrmd IPC server: Unknown error: 4294967210 (-86)
> Dec 11 14:22:32 ctx4980gate2 lrmd[22677]: error: try_server_create: New IPC server could not be created because another lrmd process exists, sending shutdown command to old lrmd process.
> Dec 11 14:22:32 ctx4980gate2 lrmd[22677]: error: mainloop_add_ipc_server: Could not start lrmd IPC server: Unknown error: 4294967210 (-86)
> Dec 11 14:22:32 ctx4980gate2 pengine[10446]: error: mainloop_add_ipc_server: Could not start pengine IPC server: Unknown error: 4294967210 (-86)
> Dec 11 14:22:32 ctx4980gate2 lrmd[22677]: error: try_server_create: New IPC server could not be created because another lrmd process exists, sending shutdown command to old lrmd process.
> Dec 11 14:22:32 ctx4980gate2 pengine[10446]: error: main: Couldn't start IPC server
> Dec 11 14:22:32 ctx4980gate2 lrmd[22677]: error: mainloop_add_ipc_server: Could not start lrmd IPC server: Unknown error: 4294967210 (-86)
> Dec 11 14:22:32 ctx4980gate2 lrmd[22677]: error: try_server_create: New IPC server could not be created because another lrmd process exists, sending shutdown command to old lrmd process.
> Dec 11 14:22:32 ctx4980gate2 lrmd[22677]: error: mainloop_add_ipc_server: Could not start lrmd IPC server: Unknown error: 4294967210 (-86)
> Dec 11 14:22:32 ctx4980gate2 lrmd[22677]: error: try_server_create: New IPC server could not be created because another lrmd process exists, sending shutdown command to old lrmd process.
> Dec 11 14:22:32 ctx4980gate2 lrmd[22677]: error: mainloop_add_ipc_server: Could not start lrmd IPC server: Unknown error: 4294967210 (-86)
> Dec 11 14:22:32 ctx4980gate2 lrmd[22677]: error: try_server_create: New IPC server could not be created because another lrmd process exists, sending shutdown command to old lrmd process.
> Dec 11 14:22:32 ctx4980gate2 lrmd[22677]: error: mainloop_add_ipc_server: Could not start lrmd IPC server: Unknown error: 4294967210 (-86)
> Dec 11 14:22:32 ctx4980gate2 lrmd[22677]: error: try_server_create: New IPC server could not be created because another lrmd process exists, sending shutdown command to old lrmd process.
> Dec 11 14:22:32 ctx4980gate2 lrmd[22677]: error: mainloop_add_ipc_server: Could not start lrmd IPC server: Unknown error: 4294967210 (-86)
> Dec 11 14:22:32 ctx4980gate2 lrmd[22677]: error: try_server_create: New IPC server could not be created because another lrmd process exists, sending shutdown command to old lrmd process.
> Dec 11 14:22:32 ctx4980gate2 lrmd[22677]: error: mainloop_add_ipc_server: Could not start lrmd IPC server: Unknown error: 4294967210 (-86)
> Dec 11 14:22:32 ctx4980gate2 pacemakerd[13003]: error: pcmk_child_exit: Child process pengine exited (pid=10446, rc=1)
> Dec 11 14:22:32 ctx4980gate2 lrmd[22677]: error: try_server_create: New IPC server could not be created because another lrmd process exists, sending shutdown command to old lrmd process.
> Dec 11 14:22:32 ctx4980gate2 lrmd[22677]: error: main: Failed to allocate lrmd server. shutting down
> Dec 11 14:22:32 ctx4980gate2 pacemakerd[13003]: notice: stop_child: Stopping lrmd: Sent -15 to process 22677
> Dec 11 14:22:32 ctx4980gate2 pacemakerd[13003]: notice: pcmk_child_exit: Child process lrmd terminated with signal 15 (pid=22677, core=0)
> Dec 11 14:22:32 ctx4980gate2 pacemakerd[13003]: notice: stop_child: Stopping stonith-ng: Sent -15 to process 7834
> -----------------------------
>
> 2012/12/11 Jan Friesse <jfriesse@xxxxxxxxxx>:
>> Actually, the main problem is that the kqueue is created BEFORE fork, and
>> according to the man page, a kqueue is NOT shared between a process and
>> its child. The patch seems to be pretty easy and I will send it.
>>
>> Honza
>>
>> Stephan wrote:
>>> Right, it works for me too when starting in foreground mode. I don't
>>> know if you have an idea what could cause this. But when running it in
>>> daemon mode, it apparently closes its file descriptor to the
>>> kevent queue somewhere. That does not happen when running in
>>> foreground mode:
>>>
>>> corosync 2371 root 3u KQUEUE 0xfffffe84b41d7980
>>>
>>> Regards,
>>>
>>> Stephan
>>>
>>> _______________________________________________
>>> discuss mailing list
>>> discuss@xxxxxxxxxxxx
>>> http://lists.corosync.org/mailman/listinfo/discuss
>>>
>>
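
A side note on reading the "Unknown error: 4294967210 (-86)" messages in the log: the rc logged by mainloop_add_ipc_server() is a negative errno value, and the quoted code passes it to strerror() unchanged. strerror() expects a positive errno, so the number it prints appears to be simply -86 reinterpreted as an unsigned 32-bit value (4294967296 - 86 = 4294967210); the same pattern holds for -48 and -13. Negating rc before the lookup should give the real message. The sketch below only illustrates that idea; rc_to_errno() and describe_rc() are hypothetical helper names, not Pacemaker or libqb functions, and which error errno 86 names depends on the platform's errno.h.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: map a "0 or negative errno" return code to the
 * positive errno value that strerror() expects. */
static int rc_to_errno(int rc)
{
    return rc < 0 ? -rc : rc;
}

/* Hypothetical helper: write "<strerror text> (<rc>)" into buf and return buf.
 * The text is whatever the local platform assigns to that errno number. */
static const char *describe_rc(int rc, char *buf, size_t len)
{
    assert(buf != NULL && len > 0);
    snprintf(buf, len, "%s (%d)", strerror(rc_to_errno(rc)), rc);
    return buf;
}
```

With helpers like these, the crm_err() call in the quoted function would pass strerror(rc_to_errno(rc)) instead of strerror(rc), and the log would show the platform's message for errno 86 rather than "Unknown error".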