Re: Corosync/Pacemaker on NetBSD

lrmd fails here:

mainloop_add_ipc_server(CRM_SYSTEM_LRMD, QB_IPC_SHM, &lrmd_ipc_callbacks);


This calls the following function in lib/common/mainloop.c:
-------8<--------
qb_ipcs_service_t *mainloop_add_ipc_server(
    const char *name, enum qb_ipc_type type,
    struct qb_ipcs_service_handlers *callbacks)
{
    int rc = 0;
    qb_ipcs_service_t* server = NULL;

    if(gio_map == NULL) {
        gio_map = qb_array_create_2(64, sizeof(struct gio_to_qb_poll), 1);
    }

    server = qb_ipcs_create(name, 0, pick_ipc_type(type), callbacks);
    qb_ipcs_poll_handlers_set(server, &gio_poll_funcs);

    rc = qb_ipcs_run(server);
    if (rc < 0) {
        crm_err("Could not start %s IPC server: %s (%d)", name,
                strerror(rc), rc);
        return NULL;
    }

    return server;
}

--------------------------

As far as I can tell, libqb is supposed to create a shared memory
region here (QB_IPC_SHM). Is this known to work on BSD systems?
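
A side note on the "Unknown error: 4294967210" text in the log: judging
from the messages, qb_ipcs_run() returns a negative errno value (the
bind failure is logged as 13 by libqb and as -13 by
mainloop_add_ipc_server), and strerror() is then handed that negative
number. Below is a minimal sketch of how the message could be decoded
instead. format_ipc_error() is a hypothetical helper of mine, not part
of Pacemaker or libqb, and it assumes the return code really is -errno.
If that holds, -86 should correspond to errno 86, which I believe is
ENOTSUP on NetBSD (worth checking in sys/errno.h).

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not a Pacemaker or libqb function): format an
 * IPC start-up failure, assuming rc is a negative errno value.
 */
static int format_ipc_error(char *buf, size_t len, const char *name, int rc)
{
    /* strerror() expects a positive errno, so flip the sign first. */
    int err = (rc < 0) ? -rc : rc;

    return snprintf(buf, len, "Could not start %s IPC server: %s (%d)",
                    name, strerror(err), rc);
}
```

With something like this, the cib_ro failure would read "Permission
denied (-13)" rather than "Unknown error: 4294967283 (-13)".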


2012/12/11 Stephan <stephanwib@xxxxxxxxxxxxxx>:
> Yes, kqueues are not inherited. I recompiled and installed Pacemaker
> 1.1 for corosync 2.x. It doesn't work yet (I just started pacemakerd,
> I hope this is okay); it seems that lrmd is hitting the first issue:
>
> lrmd[15312]:    error: mainloop_add_ipc_server: Could not start lrmd
> IPC server: Unknown error: 4294967210 (-86)
>
>
> All messages:
>
> -----8<----------
> Dec 11 14:22:31 ctx4980gate2 pacemakerd[13003]:     info:
> crm_update_callsites: Enabling callsites based on priority=6,
> files=(null), functions=(null), formats=(null), tags=(null)
> Dec 11 14:22:32 ctx4980gate2 corosync[18423]:   [QB    ] got EV_EOF on fd 20.
> Dec 11 14:22:32 ctx4980gate2 pacemakerd[13003]:   notice:
> crm_add_logfile: Additional logging available in
> /var/log/cluster/corosync.log
> Dec 11 14:22:32 ctx4980gate2 pacemakerd[13003]:   notice: main:
> Starting Pacemaker 1.1.8 (Build: 1f8858c):  ncurses libqb-logging
> libqb-ipc lha-fencing  corosync-native
> Dec 11 14:22:32 ctx4980gate2 corosync[18423]:   [QB    ] got EV_EOF on fd 18.
> Dec 11 14:22:32 ctx4980gate2 pacemakerd[13003]:   notice:
> update_node_processes: 0x7f7ff7b09150 Node 3232235777 now known as
> ctx4980gate2, was:
> Dec 11 14:22:32 ctx4980gate2 cib[13836]:   notice: crm_add_logfile:
> Additional logging available in /var/log/cluster/corosync.log
> Dec 11 14:22:32 ctx4980gate2 cib[13836]:     info:
> crm_update_callsites: Enabling callsites based on priority=6,
> files=(null), functions=(null), formats=(null), tags=(null)
> Dec 11 14:22:32 ctx4980gate2 stonith-ng[7834]:   notice:
> crm_add_logfile: Additional logging available in
> /var/log/cluster/corosync.log
> Dec 11 14:22:32 ctx4980gate2 stonith-ng[7834]:     info:
> crm_update_callsites: Enabling callsites based on priority=6,
> files=(null), functions=(null), formats=(null), tags=(null)
> Dec 11 14:22:32 ctx4980gate2 stonith-ng[7834]:   notice:
> crm_cluster_connect: Connecting to cluster infrastructure: corosync
> Dec 11 14:22:32 ctx4980gate2 cib[13836]:   notice: main: Using new
> config location: /var/lib/pacemaker/cib
> Dec 11 14:22:32 ctx4980gate2 cib[13836]:  warning: retrieveCib:
> Cluster configuration not found: /var/lib/pacemaker/cib/cib.xml
> Dec 11 14:22:32 ctx4980gate2 cib[13836]:  warning: readCibXmlFile:
> Primary configuration corrupt or unusable, trying backup...
> Dec 11 14:22:32 ctx4980gate2 cib[13836]:  warning: readCibXmlFile:
> Continuing with an empty configuration.
> Dec 11 14:22:32 ctx4980gate2 lrmd[15312]:   notice: crm_add_logfile:
> Additional logging available in /var/log/cluster/corosync.log
> Dec 11 14:22:32 ctx4980gate2 lrmd[15312]:     info:
> crm_update_callsites: Enabling callsites based on priority=6,
> files=(null), functions=(null), formats=(null), tags=(null)
> Dec 11 14:22:32 ctx4980gate2 lrmd[15312]:    error:
> mainloop_add_ipc_server: Could not start lrmd IPC server: Unknown
> error: 4294967210 (-86)
> Dec 11 14:22:32 ctx4980gate2 lrmd[15312]:    error: try_server_create:
> New IPC server could not be created because another lrmd process
> exists, sending shutdown command to old lrmd process.
> Dec 11 14:22:32 ctx4980gate2 lrmd[15312]:    error:
> mainloop_add_ipc_server: Could not start lrmd IPC server: Unknown
> error: 4294967210 (-86)
> Dec 11 14:22:32 ctx4980gate2 lrmd[15312]:    error: try_server_create:
> New IPC server could not be created because another lrmd process
> exists, sending shutdown command to old lrmd process.
> Dec 11 14:22:32 ctx4980gate2 lrmd[15312]:    error:
> mainloop_add_ipc_server: Could not start lrmd IPC server: Unknown
> error: 4294967210 (-86)
> Dec 11 14:22:32 ctx4980gate2 attrd[9542]:   notice: crm_add_logfile:
> Additional logging available in /var/log/cluster/corosync.log
> Dec 11 14:22:32 ctx4980gate2 lrmd[15312]:    error: try_server_create:
> New IPC server could not be created because another lrmd process
> exists, sending shutdown command to old lrmd process.
> Dec 11 14:22:32 ctx4980gate2 lrmd[15312]:    error:
> mainloop_add_ipc_server: Could not start lrmd IPC server: Unknown
> error: 4294967210 (-86)
> Dec 11 14:22:32 ctx4980gate2 pengine[17349]:   notice:
> crm_add_logfile: Additional logging available in
> /var/log/cluster/corosync.log
> Dec 11 14:22:32 ctx4980gate2 lrmd[15312]:    error: try_server_create:
> New IPC server could not be created because another lrmd process
> exists, sending shutdown command to old lrmd process.
> Dec 11 14:22:32 ctx4980gate2 lrmd[15312]:    error:
> mainloop_add_ipc_server: Could not start lrmd IPC server: Unknown
> error: 4294967210 (-86)
> Dec 11 14:22:32 ctx4980gate2 lrmd[15312]:    error: try_server_create:
> New IPC server could not be created because another lrmd process
> exists, sending shutdown command to old lrmd process.
> Dec 11 14:22:32 ctx4980gate2 lrmd[15312]:    error:
> mainloop_add_ipc_server: Could not start lrmd IPC server: Unknown
> error: 4294967210 (-86)
> Dec 11 14:22:32 ctx4980gate2 attrd[9542]:   notice:
> crm_cluster_connect: Connecting to cluster infrastructure: corosync
> Dec 11 14:22:32 ctx4980gate2 lrmd[15312]:    error: try_server_create:
> New IPC server could not be created because another lrmd process
> exists, sending shutdown command to old lrmd process.
> Dec 11 14:22:32 ctx4980gate2 lrmd[15312]:    error:
> mainloop_add_ipc_server: Could not start lrmd IPC server: Unknown
> error: 4294967210 (-86)
> Dec 11 14:22:32 ctx4980gate2 lrmd[15312]:    error: try_server_create:
> New IPC server could not be created because another lrmd process
> exists, sending shutdown command to old lrmd process.
> Dec 11 14:22:32 ctx4980gate2 lrmd[15312]:    error:
> mainloop_add_ipc_server: Could not start lrmd IPC server: Unknown
> error: 4294967210 (-86)
> Dec 11 14:22:32 ctx4980gate2 pengine[17349]:    error:
> mainloop_add_ipc_server: Could not start pengine IPC server: Unknown
> error: 4294967210 (-86)
> Dec 11 14:22:32 ctx4980gate2 lrmd[15312]:    error: try_server_create:
> New IPC server could not be created because another lrmd process
> exists, sending shutdown command to old lrmd process.
> Dec 11 14:22:32 ctx4980gate2 pengine[17349]:    error: main: Couldn't
> start IPC server
> Dec 11 14:22:32 ctx4980gate2 lrmd[15312]:    error:
> mainloop_add_ipc_server: Could not start lrmd IPC server: Unknown
> error: 4294967210 (-86)
> Dec 11 14:22:32 ctx4980gate2 lrmd[15312]:    error: try_server_create:
> New IPC server could not be created because another lrmd process
> exists, sending shutdown command to old lrmd process.
> Dec 11 14:22:32 ctx4980gate2 lrmd[15312]:    error:
> mainloop_add_ipc_server: Could not start lrmd IPC server: Unknown
> error: 4294967210 (-86)
> Dec 11 14:22:32 ctx4980gate2 lrmd[15312]:    error: try_server_create:
> New IPC server could not be created because another lrmd process
> exists, sending shutdown command to old lrmd process.
> Dec 11 14:22:32 ctx4980gate2 lrmd[15312]:    error: main: Failed to
> allocate lrmd server.  shutting down
> Dec 11 14:22:32 ctx4980gate2 pacemakerd[13003]:    error:
> pcmk_child_exit: Child process lrmd exited (pid=15312, rc=255)
> Dec 11 14:22:32 ctx4980gate2 attrd[9542]:    error:
> qb_ipcs_us_publish: Could not bind AF_UNIX (/var/run/attrd): Address
> already in use (48)
> Dec 11 14:22:32 ctx4980gate2 pacemakerd[13003]:   notice:
> pcmk_child_exit: Respawning failed child process: lrmd
> Dec 11 14:22:32 ctx4980gate2 attrd[9542]:    error:
> mainloop_add_ipc_server: Could not start attrd IPC server: Unknown
> error: 4294967248 (-48)
> Dec 11 14:22:32 ctx4980gate2 attrd[9542]:    error: main: Could not
> start IPC server
> Dec 11 14:22:32 ctx4980gate2 attrd[9542]:    error: main: Aborting startup
> Dec 11 14:22:32 ctx4980gate2 corosync[18423]:   [QB    ] got EV_EOF on fd 26.
> Dec 11 14:22:32 ctx4980gate2 pacemakerd[13003]:    error:
> pcmk_child_exit: Child process pengine exited (pid=17349, rc=1)
> Dec 11 14:22:32 ctx4980gate2 pacemakerd[13003]:   notice:
> pcmk_child_exit: Respawning failed child process: pengine
> Dec 11 14:22:32 ctx4980gate2 pacemakerd[13003]:    error:
> pcmk_child_exit: Child process attrd exited (pid=9542, rc=100)
> Dec 11 14:22:32 ctx4980gate2 pacemakerd[13003]:  warning:
> pcmk_child_exit: Pacemaker child process attrd no longer wishes to be
> respawned. Shutting ourselves down.
> Dec 11 14:22:32 ctx4980gate2 pacemakerd[13003]:   notice:
> pcmk_shutdown_worker: Shuting down Pacemaker
> Dec 11 14:22:32 ctx4980gate2 pacemakerd[13003]:   notice: stop_child:
> Stopping crmd: Sent -15 to process 13681
> Dec 11 14:22:32 ctx4980gate2 pacemakerd[13003]:   notice:
> pcmk_child_exit: Child process crmd terminated with signal 15
> (pid=13681, core=0)
> Dec 11 14:22:32 ctx4980gate2 pacemakerd[13003]:   notice: stop_child:
> Stopping pengine: Sent -15 to process 10446
> Dec 11 14:22:32 ctx4980gate2 cib[13836]:   notice:
> crm_cluster_connect: Connecting to cluster infrastructure: corosync
> Dec 11 14:22:32 ctx4980gate2 cib[13836]:    error: qb_ipcs_us_publish:
> Could not bind AF_UNIX (/var/run/cib_ro): Permission denied (13)
> Dec 11 14:22:32 ctx4980gate2 cib[13836]:    error:
> mainloop_add_ipc_server: Could not start cib_ro IPC server: Unknown
> error: 4294967283 (-13)
> Dec 11 14:22:32 ctx4980gate2 cib[13836]:    error: qb_ipcs_us_publish:
> Could not bind AF_UNIX (/var/run/cib_rw): Permission denied (13)
> Dec 11 14:22:32 ctx4980gate2 cib[13836]:    error:
> mainloop_add_ipc_server: Could not start cib_rw IPC server: Unknown
> error: 4294967283 (-13)
> Dec 11 14:22:32 ctx4980gate2 cib[13836]:    error:
> mainloop_add_ipc_server: Could not start cib_shm IPC server: Unknown
> error: 4294967210 (-86)
> Dec 11 14:22:32 ctx4980gate2 cib[13836]:    error: cib_init: Couldnt
> start all IPC channels, exiting.
> Dec 11 14:22:32 ctx4980gate2 corosync[18423]:   [QB    ] got EV_EOF on fd 26.
> Dec 11 14:22:32 ctx4980gate2 pacemakerd[13003]:    error:
> pcmk_child_exit: Child process cib exited (pid=13836, rc=255)
> Dec 11 14:22:32 ctx4980gate2 lrmd[22677]:   notice: crm_add_logfile:
> Additional logging available in /var/log/cluster/corosync.log
> Dec 11 14:22:32 ctx4980gate2 lrmd[22677]:     info:
> crm_update_callsites: Enabling callsites based on priority=6,
> files=(null), functions=(null), formats=(null), tags=(null)
> Dec 11 14:22:32 ctx4980gate2 lrmd[22677]:    error:
> mainloop_add_ipc_server: Could not start lrmd IPC server: Unknown
> error: 4294967210 (-86)
> Dec 11 14:22:32 ctx4980gate2 pengine[10446]:   notice:
> crm_add_logfile: Additional logging available in
> /var/log/cluster/corosync.log
> Dec 11 14:22:32 ctx4980gate2 lrmd[22677]:    error: try_server_create:
> New IPC server could not be created because another lrmd process
> exists, sending shutdown command to old lrmd process.
> Dec 11 14:22:32 ctx4980gate2 lrmd[22677]:    error:
> mainloop_add_ipc_server: Could not start lrmd IPC server: Unknown
> error: 4294967210 (-86)
> Dec 11 14:22:32 ctx4980gate2 lrmd[22677]:    error: try_server_create:
> New IPC server could not be created because another lrmd process
> exists, sending shutdown command to old lrmd process.
> Dec 11 14:22:32 ctx4980gate2 lrmd[22677]:    error:
> mainloop_add_ipc_server: Could not start lrmd IPC server: Unknown
> error: 4294967210 (-86)
> Dec 11 14:22:32 ctx4980gate2 lrmd[22677]:    error: try_server_create:
> New IPC server could not be created because another lrmd process
> exists, sending shutdown command to old lrmd process.
> Dec 11 14:22:32 ctx4980gate2 lrmd[22677]:    error:
> mainloop_add_ipc_server: Could not start lrmd IPC server: Unknown
> error: 4294967210 (-86)
> Dec 11 14:22:32 ctx4980gate2 pengine[10446]:    error:
> mainloop_add_ipc_server: Could not start pengine IPC server: Unknown
> error: 4294967210 (-86)
> Dec 11 14:22:32 ctx4980gate2 lrmd[22677]:    error: try_server_create:
> New IPC server could not be created because another lrmd process
> exists, sending shutdown command to old lrmd process.
> Dec 11 14:22:32 ctx4980gate2 pengine[10446]:    error: main: Couldn't
> start IPC server
> Dec 11 14:22:32 ctx4980gate2 lrmd[22677]:    error:
> mainloop_add_ipc_server: Could not start lrmd IPC server: Unknown
> error: 4294967210 (-86)
> Dec 11 14:22:32 ctx4980gate2 lrmd[22677]:    error: try_server_create:
> New IPC server could not be created because another lrmd process
> exists, sending shutdown command to old lrmd process.
> Dec 11 14:22:32 ctx4980gate2 lrmd[22677]:    error:
> mainloop_add_ipc_server: Could not start lrmd IPC server: Unknown
> error: 4294967210 (-86)
> Dec 11 14:22:32 ctx4980gate2 lrmd[22677]:    error: try_server_create:
> New IPC server could not be created because another lrmd process
> exists, sending shutdown command to old lrmd process.
> Dec 11 14:22:32 ctx4980gate2 lrmd[22677]:    error:
> mainloop_add_ipc_server: Could not start lrmd IPC server: Unknown
> error: 4294967210 (-86)
> Dec 11 14:22:32 ctx4980gate2 lrmd[22677]:    error: try_server_create:
> New IPC server could not be created because another lrmd process
> exists, sending shutdown command to old lrmd process.
> Dec 11 14:22:32 ctx4980gate2 lrmd[22677]:    error:
> mainloop_add_ipc_server: Could not start lrmd IPC server: Unknown
> error: 4294967210 (-86)
> Dec 11 14:22:32 ctx4980gate2 lrmd[22677]:    error: try_server_create:
> New IPC server could not be created because another lrmd process
> exists, sending shutdown command to old lrmd process.
> Dec 11 14:22:32 ctx4980gate2 lrmd[22677]:    error:
> mainloop_add_ipc_server: Could not start lrmd IPC server: Unknown
> error: 4294967210 (-86)
> Dec 11 14:22:32 ctx4980gate2 lrmd[22677]:    error: try_server_create:
> New IPC server could not be created because another lrmd process
> exists, sending shutdown command to old lrmd process.
> Dec 11 14:22:32 ctx4980gate2 lrmd[22677]:    error:
> mainloop_add_ipc_server: Could not start lrmd IPC server: Unknown
> error: 4294967210 (-86)
> Dec 11 14:22:32 ctx4980gate2 pacemakerd[13003]:    error:
> pcmk_child_exit: Child process pengine exited (pid=10446, rc=1)
> Dec 11 14:22:32 ctx4980gate2 lrmd[22677]:    error: try_server_create:
> New IPC server could not be created because another lrmd process
> exists, sending shutdown command to old lrmd process.
> Dec 11 14:22:32 ctx4980gate2 lrmd[22677]:    error: main: Failed to
> allocate lrmd server.  shutting down
> Dec 11 14:22:32 ctx4980gate2 pacemakerd[13003]:   notice: stop_child:
> Stopping lrmd: Sent -15 to process 22677
> Dec 11 14:22:32 ctx4980gate2 pacemakerd[13003]:   notice:
> pcmk_child_exit: Child process lrmd terminated with signal 15
> (pid=22677, core=0)
> Dec 11 14:22:32 ctx4980gate2 pacemakerd[13003]:   notice: stop_child:
> Stopping stonith-ng: Sent -15 to process 7834
> -----------------------------
>
> 2012/12/11 Jan Friesse <jfriesse@xxxxxxxxxx>:
>> Actually, the main problem is that the kqueue is created BEFORE fork,
>> and according to the man page, a kqueue is NOT inherited by a child
>> process. The patch seems to be pretty easy and I will send it.
>>
>> Honza
>>
>> Stephan wrote:
>>> Right, it works for me too when starting in foreground mode. I don't
>>> know if you have an idea what could cause this, but when running in
>>> daemon mode, it apparently closes its file descriptor for the
>>> kevent queue somewhere. That does not happen when running in
>>> foreground mode:
>>>
>>> corosync 2371 root    3u  KQUEUE 0xfffffe84b41d7980
>>>
>>>
>>>
>>> Regards,
>>>
>>> Stephan
>>>
>>> _______________________________________________
>>> discuss mailing list
>>> discuss@xxxxxxxxxxxx
>>> http://lists.corosync.org/mailman/listinfo/discuss
>>>
>>
