Re: Corosync 2.3 dies randomly


 



On 05/08/2013 09:04 AM, Robert Parsons wrote:

I'm running a 14-node pacemaker/corosync cluster. I recently ditched the Ubuntu 12.04 corosync and pacemaker packages, opting to compile from source instead. I successfully built Pacemaker 1.1.9 and Corosync 2.3.0; however, Corosync often does not start on boot. Even worse, it tends to die randomly once it is started, leaving no error messages in the log. I can downgrade to Corosync 1.4.5 and all is fine. I have no explanation as to why 2.3.0 fails so frequently. I've included some debugging information. Any help would be much appreciated.

Thanks.

- Rob P.


Rob,

I assume you're running on x86_64 and not some other architecture. Could you verify?

Typically a bus error occurs because the mmap() operations in libqb are failing in some way on x86_64 architectures. In the past, this has happened because the tmpfs that stores the memory maps was not large enough. Are you using the latest master of libqb?

Regards
-steve
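
One way to check the tmpfs size mentioned above is a short script like the following. It is only a sketch: it assumes libqb places its ring buffers under /dev/shm (the blackbox output further down shows paths there), and the 8388608-byte figure is simply the ring buffer size that corosync-blackbox reports in this thread, not a documented requirement. The function names are made up for illustration.

```python
import shutil

# Ring buffer size reported by corosync-blackbox in this thread (bytes).
# Used here only as an illustrative threshold, not a documented libqb limit.
EXAMPLE_RING_BYTES = 8388608


def tmpfs_usage(path="/dev/shm"):
    """Return (total, used, free) in bytes for the filesystem holding path."""
    usage = shutil.disk_usage(path)
    return usage.total, usage.used, usage.free


def rings_that_fit(free_bytes, ring_bytes=EXAMPLE_RING_BYTES):
    """Return how many buffers of ring_bytes would still fit in free_bytes."""
    return free_bytes // ring_bytes


def report(path="/dev/shm"):
    """Format a one-line summary of the space left on path."""
    total, used, free = tmpfs_usage(path)
    return "%s: total=%d used=%d free=%d (room for %d more %d-byte buffers)" % (
        path, total, used, free, rings_that_fit(free), EXAMPLE_RING_BYTES)
```

Calling print(report()) on an affected node while corosync is running would show how much of /dev/shm is left. A free value close to zero would be consistent with the mmap() failure described above, though it would not prove it.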


strace output

sendmsg(11, {msg_name(16)={sa_family=AF_INET, sin_port=htons(9000), sin_addr=inet_addr("226.94.1.3")}, msg_iov(1)=[{"\376\376\0\0\2\0\"\377w\4\1\n\2w\4\1\n\2\0\n\1\4w\0\0\0\0\0\0\0\0\0"..., 87}], msg_controllen=0, msg_flags=0}, MSG_NOSIGNAL) = 87
sendmsg(10, {msg_name(0)=NULL, msg_iov(1)=[{"\376\376\0\0\2\0\"\377w\4\1\n\2w\4\1\n\2\0\n\1\4w\0\0\0\0\0\0\0\0\0"..., 87}], msg_controllen=0, msg_flags=0}, MSG_NOSIGNAL) = 87
clock_gettime(CLOCK_MONOTONIC, {57355, 710432961}) = 0
clock_gettime(CLOCK_MONOTONIC, {57355, 710674580}) = 0
clock_gettime(CLOCK_MONOTONIC, {57355, 711018759}) = 0
epoll_wait(4, {{EPOLLIN, {u32=2, u64=5813885752995479554}}}, 12, 0) = 1
clock_gettime(CLOCK_MONOTONIC, {57355, 711595469}) = 0
clock_gettime(CLOCK_MONOTONIC, {57355, 711888254}) = 0
epoll_wait(4, {{EPOLLIN, {u32=2, u64=5813885752995479554}}}, 12, 257) = 1
clock_gettime(CLOCK_MONOTONIC, {57355, 712421751}) = 0
clock_gettime(CLOCK_MONOTONIC, {57355, 712715186}) = 0
recvmsg(9, {msg_name(0)={sa_family=0x7c50 /* AF_??? */, sa_data="\234\217A\177\0\0\20s\234\217A\177\0\0"}, msg_iov(1)=[{"\376\376\0\0\2\0\"\377w\4\1\n\2w\4\1\n\2\0\n\1\4w\0\0\0\0\0\0\0\0\0"..., 10000}], msg_controllen=0, msg_flags=0}, MSG_DONTWAIT|MSG_NOSIGNAL) = 87
clock_gettime(CLOCK_MONOTONIC, {57355, 713490334}) = 0
epoll_wait(4, {}, 12, 0)                = 0
clock_gettime(CLOCK_MONOTONIC, {57355, 714051544}) = 0
clock_gettime(CLOCK_MONOTONIC, {57355, 714348259}) = 0
clock_gettime(CLOCK_MONOTONIC, {57355, 714558026}) = 0
epoll_wait(4, {}, 12, 0)                = 0
clock_gettime(CLOCK_MONOTONIC, {57355, 715203138}) = 0
clock_gettime(CLOCK_MONOTONIC, {57355, 715411760}) = 0
epoll_wait(4, {{EPOLLIN, {u32=1, u64=7284734596711710721}}}, 12, 254) = 1
clock_gettime(CLOCK_MONOTONIC, {57355, 930356426}) = 0
clock_gettime(CLOCK_MONOTONIC, {57355, 930649677}) = 0
recvmsg(8, {msg_name(16)={sa_family=AF_INET, sin_port=htons(8999), sin_addr=inet_addr("10.1.4.124")}, msg_iov(1)=[{"\376\376\0\0\5\0\"\377|\4\1\nw\4\1\n\2\0\n\1\4w\0\0\0\0\0\0\0\0\0\0"..., 10000}], msg_controllen=0, msg_flags=0}, MSG_DONTWAIT|MSG_NOSIGNAL) = 42
sendmsg(11, {msg_name(16)={sa_family=AF_INET, sin_port=htons(9000), sin_addr=inet_addr("10.1.4.121")}, msg_iov(1)=[{"\376\376\0\0\0\0\"\377w\4\1\n\256\0\0\0\10=\0\0\256\0\0\0\0\0\0\0w\4\1\n"..., 74}], msg_controllen=0, msg_flags=0}, MSG_NOSIGNAL) = 74
clock_gettime(CLOCK_MONOTONIC, {57355, 931677894}) = 0
clock_gettime(CLOCK_MONOTONIC, {57355, 931933853}) = 0
epoll_wait(4, {}, 12, 0)                = 0
clock_gettime(CLOCK_MONOTONIC, {57355, 932546611}) = 0
clock_gettime(CLOCK_MONOTONIC, {57355, 932960908}) = 0
clock_gettime(CLOCK_MONOTONIC, {57355, 933318531}) = 0
epoll_wait(4, {}, 12, 0)                = 0
clock_gettime(CLOCK_MONOTONIC, {57355, 933818440}) = 0
clock_gettime(CLOCK_MONOTONIC, {57355, 934122715}) = 0
epoll_wait(4, {{EPOLLIN, {u32=1, u64=7284734596711710721}}}, 12, 35) = 1
clock_gettime(CLOCK_MONOTONIC, {57355, 939059907}) = 0
clock_gettime(CLOCK_MONOTONIC, {57355, 939326101}) = 0
recvmsg(8, {msg_name(16)={sa_family=AF_INET, sin_port=htons(8999), sin_addr=inet_addr("10.1.4.124")}, msg_iov(1)=[{"\376\376\0\0\1\2\"\377|\4\1\n\2|\4\1\n\2\0\n\1\4|\0\0\0\0\0\0\0\0\0"..., 10000}], msg_controllen=0, msg_flags=0}, MSG_DONTWAIT|MSG_NOSIGNAL) = 1472
clock_gettime(CLOCK_MONOTONIC, {57355, 939958334}) = 0
epoll_wait(4, {{EPOLLIN, {u32=1, u64=7284734596711710721}}, {EPOLLIN, {u32=3, u64=8736411904714473475}}}, 12, 0) = 2
clock_gettime(CLOCK_MONOTONIC, {57355, 940609120}) = 0
clock_gettime(CLOCK_MONOTONIC, {57355, 940881911}) = 0
recvmsg(8, {msg_name(16)={sa_family=AF_INET, sin_port=htons(8999), sin_addr=inet_addr("10.1.4.124")}, msg_iov(1)=[{"\376\376\0\0\1\2\"\377|\4\1\n\2|\4\1\n\2\0\n\1\4|\0\0\0\0\0\0\0\0\0"..., 10000}], msg_controllen=0, msg_flags=0}, MSG_DONTWAIT|MSG_NOSIGNAL) = 617
sendto(23, "\0", 1, MSG_NOSIGNAL, NULL, 0) = 1
recvmsg(12, {msg_name(16)={sa_family=AF_INET, sin_port=htons(8999), sin_addr=inet_addr("10.1.4.133")}, msg_iov(1)=[{"\376\376\0\0\0\0\"\377\205\4\1\n\262\0\0\0\20=\0\0\262\0\0\0\0\0\0\0w\4\1\n"..., 10000}], msg_controllen=0, msg_flags=0}, MSG_DONTWAIT|MSG_NOSIGNAL) = 74
poll([{fd=8, events=POLLIN}], 1, 0)     = 1 ([{fd=8, revents=POLLIN}])
recvmsg(8, {msg_name(16)={sa_family=AF_INET, sin_port=htons(8999), sin_addr=inet_addr("10.1.4.133")}, msg_iov(1)=[{"\376\376\0\0\5\0\"\377\205\4\1\nw\4\1\n\2\0\n\1\4w\0\0\0\0\0\0\0\0\0\0"..., 10000}], msg_controllen=0, msg_flags=0}, MSG_DONTWAIT|MSG_NOSIGNAL) = 42
sendmsg(11, {msg_name(16)={sa_family=AF_INET, sin_port=htons(9000), sin_addr=inet_addr("10.1.4.121")}, msg_iov(1)=[{"\376\376\0\0\0\0\"\377w\4\1\n\256\0\0\0\10=\0\0\256\0\0\0\0\0\0\0w\4\1\n"..., 74}], msg_controllen=0, msg_flags=0}, MSG_NOSIGNAL) = 74
clock_gettime(CLOCK_MONOTONIC, {57355, 943922220}) = 0
poll([{fd=8, events=POLLIN}], 1, 0)     = 1 ([{fd=8, revents=POLLIN}])
recvmsg(8, {msg_name(16)={sa_family=AF_INET, sin_port=htons(8999), sin_addr=inet_addr("10.1.4.133")}, msg_iov(1)=[{"\376\376\0\0\1\2\"\377\205\4\1\n\2\205\4\1\n\2\0\n\1\4\205\0\0\0\0\0\0\0\0\0"..., 10000}], msg_controllen=0, msg_flags=0}, MSG_DONTWAIT|MSG_NOSIGNAL) = 1472
poll([{fd=8, events=POLLIN}], 1, 0)     = 1 ([{fd=8, revents=POLLIN}])
recvmsg(8, {msg_name(16)={sa_family=AF_INET, sin_port=htons(8999), sin_addr=inet_addr("10.1.4.133")}, msg_iov(1)=[{"\376\376\0\0\1\2\"\377\205\4\1\n\2\205\4\1\n\2\0\n\1\4\205\0\0\0\0\0\0\0\0\0"..., 10000}], msg_controllen=0, msg_flags=0}, MSG_DONTWAIT|MSG_NOSIGNAL) = 584
sendto(23, "\0", 1, MSG_NOSIGNAL, NULL, 0) = 1
poll([{fd=8, events=POLLIN}], 1, 0)     = 0 (Timeout)
poll([{fd=9, events=POLLIN}], 1, 0)     = 0 (Timeout)
clock_gettime(CLOCK_MONOTONIC, {57355, 946686072}) = 0
sendmsg(11, {msg_name(16)={sa_family=AF_INET, sin_port=htons(9000), sin_addr=inet_addr("10.1.4.121")}, msg_iov(1)=[{"\376\376\0\0\0\0\"\377w\4\1\n\262\0\0\0\21=\0\0\262\0\0\0\0\0\0\0w\4\1\n"..., 74}], msg_controllen=0, msg_flags=0}, MSG_NOSIGNAL) = 74
clock_gettime(CLOCK_MONOTONIC, {57355, 952769552}) = 0
clock_gettime(CLOCK_MONOTONIC, {57355, 952985732}) = 0
clock_gettime(CLOCK_MONOTONIC, {57355, 953217383}) = 0
clock_gettime(CLOCK_MONOTONIC, {57355, 953489345}) = 0
epoll_wait(4, {}, 12, 0)                = 0
clock_gettime(CLOCK_MONOTONIC, {57355, 954019395}) = 0
clock_gettime(CLOCK_MONOTONIC, {57355, 954314404}) = 0
epoll_wait(4, {{EPOLLIN, {u32=3, u64=8736411904714473475}}}, 12, 586) = 1
clock_gettime(CLOCK_MONOTONIC, {57355, 954907616}) = 0
clock_gettime(CLOCK_MONOTONIC, {57355, 955138096}) = 0
recvmsg(12, {msg_name(16)={sa_family=AF_INET, sin_port=htons(8999), sin_addr=inet_addr("10.1.4.133")}, msg_iov(1)=[{"\376\376\0\0\0\0\"\377\205\4\1\n\262\0\0\0\31=\0\0\262\0\0\0\0\0\0\0w\4\1\n"..., 10000}], msg_controllen=0, msg_flags=0}, MSG_DONTWAIT|MSG_NOSIGNAL) = 74
clock_gettime(CLOCK_MONOTONIC, {57355, 955849968}) = 0
poll([{fd=8, events=POLLIN}], 1, 0)     = 0 (Timeout)
poll([{fd=9, events=POLLIN}], 1, 0)     = 0 (Timeout)
clock_gettime(CLOCK_MONOTONIC, {57355, 956746383}) = 0
--- SIGBUS (Bus error) @ 0 (0) ---
Process 30095 detached





corosync-blackbox output


root@krusty:~# corosync-blackbox
Failed to initialize the cmap API. Error CS_ERR_LIBRARY
Failed to initialize the cmap API. Error CS_ERR_LIBRARY
Dumping the contents of /var/lib/corosync/fdata
[debug] shm size:8388608; real_size:8388608; rb->word_size:2097152
[debug] read total of: 8388620
Ringbuffer:
 ->NORMAL
 ->write_pt [673]
 ->read_pt [0]
 ->size [2097152 words]
 =>free [8385912 bytes]
 =>used [2692 bytes]
debug May 07 15:26:52 handle_new_connection(476):2147483648: IPC credentials authenticated (13438-14116-18)
debug May 07 15:26:52 qb_ipcs_shm_connect(294):9: connecting to client [14116]
debug May 07 15:26:52 qb_rb_open_2(226):9: shm size:zd; real_size:zd; rb->word_size:1048576
debug May 07 15:26:52 qb_rb_open_2(226):9: shm size:zd; real_size:zd; rb->word_size:1048576
debug May 07 15:26:52 qb_rb_open_2(226):9: shm size:zd; real_size:zd; rb->word_size:1048576
debug May 07 15:26:52 cs_ipcs_connection_created(269):8: connection created
debug May 07 15:26:52 cmap_lib_init_fn(306):9: lib_init_fn: conn=0x7feb08b0ea00
debug May 07 15:26:52 qb_ipcs_dispatch_connection_request(723):9: HUP conn (13438-14116-18)
debug May 07 15:26:52 qb_ipcs_disconnect(565):9: qb_ipcs_disconnect(13438-14116-18) state:2
debug May 07 15:26:52 _del(117):9: epoll_ctl(del): Bad file descriptor (9)
debug May 07 15:26:52 cs_ipcs_connection_closed(414):8: cs_ipcs_connection_closed()
debug May 07 15:26:52 cmap_lib_exit_fn(325):9: exit_fn for conn=0x7feb08b0ea00
debug May 07 15:26:52 cs_ipcs_connection_destroyed(387):8: cs_ipcs_connection_destroyed()
trace   May 07 15:26:52 qb_rb_close(279):9: ENTERING qb_rb_close()
trace May 07 15:26:52 my_posix_sem_destroy(91):9: ENTERING my_posix_sem_destroy()
debug May 07 15:26:52 qb_rb_close(290):9: Free'ing ringbuffer: /dev/shm/qb-cmap-response-13438-14116-18-header
trace   May 07 15:26:52 qb_rb_close(279):9: ENTERING qb_rb_close()
trace May 07 15:26:52 my_posix_sem_destroy(91):9: ENTERING my_posix_sem_destroy()
debug May 07 15:26:52 qb_rb_close(290):9: Free'ing ringbuffer: /dev/shm/qb-cmap-event-13438-14116-18-header
trace   May 07 15:26:52 qb_rb_close(279):9: ENTERING qb_rb_close()
trace May 07 15:26:52 my_posix_sem_destroy(91):9: ENTERING my_posix_sem_destroy()
debug May 07 15:26:52 qb_rb_close(290):9: Free'ing ringbuffer: /dev/shm/qb-cmap-request-13438-14116-18-header
debug May 07 15:26:52 handle_new_connection(476):2147483648: IPC credentials authenticated (13438-14118-18)
debug May 07 15:26:52 qb_ipcs_shm_connect(294):9: connecting to client [14118]
debug May 07 15:26:52 qb_rb_open_2(226):9: shm size:zd; real_size:zd; rb->word_size:1048576
debug May 07 15:26:52 qb_rb_open_2(226):9: shm size:zd; real_size:zd; rb->word_size:1048576
debug May 07 15:26:52 qb_rb_open_2(226):9: shm size:zd; real_size:zd; rb->word_size:1048576
debug May 07 15:26:52 cs_ipcs_connection_created(269):8: connection created
debug May 07 15:26:52 cmap_lib_init_fn(306):9: lib_init_fn: conn=0x7feb08b0ea00
ERROR: qb_rb_chunk_read failed: Connection timed out
[trace] ENTERING qb_rb_close()
[debug] Free'ing ringbuffer: /dev/shm/qb-create_from_file-header




core dump

Reading symbols from /usr/sbin/corosync...done.
[New LWP 30095]
[New LWP 30096]

warning: Can't read pathname for load map: Input/output error.
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `corosync'.
Program terminated with signal 7, Bus error.
#0 0x00007f418e4ea823 in qb_rb_chunk_alloc (rb=0x7f418f9c0d20, len=<optimized out>) at ringbuffer.c:456
456             rb->shared_data[write_pt] = 0;

(gdb) thread apply all bt

Thread 2 (Thread 0x7f418bf6f700 (LWP 30096)):
#0  0x00007f418e2cdfd0 in sem_wait () from /lib/x86_64-linux-gnu/libpthread.so.0
#1  0x00007f418e4f45f3 in qb_logt_worker_thread (data=<optimized out>) at log_thread.c:71
#2  0x00007f418e2c7e9a in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#3  0x00007f418dff3cbd in clone () from /lib/x86_64-linux-gnu/libc.so.6
#4  0x0000000000000000 in ?? ()

Thread 1 (Thread 0x7f418ed96700 (LWP 30095)):
#0  0x00007f418e4ea823 in qb_rb_chunk_alloc (rb=0x7f418f9c0d20, len=<optimized out>) at ringbuffer.c:456
#1  0x00007f418e4f4bb9 in _blackbox_vlogger (target=<optimized out>, cs=0x7f418fa3e4d8, timestamp=1368027286,
    ap=0x7fff3408ebf8) at log_blackbox.c:74
#2  0x00007f418e4f2dea in qb_log_real_va_ (cs=0x7f418fa3e4d8, ap=0x7fff3408ee88) at log.c:184
#3  0x00007f418edbdf00 in _logsys_log_printf (level=<optimized out>, subsys=<optimized out>, function_name=<optimized out>, file_name=<optimized out>, file_line=<optimized out>, format=<optimized out>)
    at main.c:898
#4  0x00007f418e95f7bd in messages_free (token_aru=<optimized out>, instance=0x7f418b732010) at totemsrp.c:2462
#5  message_handler_orf_token (instance=0x7f418b732010, msg=<optimized out>, endian_conversion_needed=<optimized out>,
    msg_len=<optimized out>) at totemsrp.c:3611
#6  0x00007f418e9617f7 in message_handler_orf_token (instance=<optimized out>, msg=<optimized out>, msg_len=<optimized out>, endian_conversion_needed=<optimized out>) at totemsrp.c:3532
#7  0x00007f418e95bcc1 in rrp_deliver_fn (context=0x7f418f9ce850, msg=0x7f418fa0f868, msg_len=70) at totemrrp.c:1783
#8  0x00007f418e956ad2 in net_deliver_fn (fd=<optimized out>, revents=<optimized out>, data=0x7f418fa0f800)
    at totemudp.c:521
#9  0x00007f418e4ec05f in _poll_dispatch_and_take_back_ (item=<optimized out>, p=<optimized out>) at loop_poll.c:98
#10 0x00007f418e4ebbd7 in qb_loop_run_level (level=0x7f418f9c7268) at loop.c:43
#11 qb_loop_run (lp=<optimized out>) at loop.c:204
#12 0x00007f418edacbe5 in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at main.c:1309





syslog


May 8 10:26:23 krusty corosync[30095]: [QUORUM] got nodeinfo message from cluster node 167838851
May 8 10:26:23 krusty corosync[30095]: [QUORUM] nodeinfo message[167838851]: votes: 1, expected: 17 flags: 1
May 8 10:26:23 krusty corosync[30095]: [QUORUM] flags: quorate: Yes Leaving: No WFA Status: No First: No Qdevice: No QdeviceAlive: No QdeviceCastVote: No QdeviceMasterWins: No
May 8 10:26:23 krusty corosync[30095]: [QUORUM] total_votes=8, expected_votes=17
May 8 10:26:23 krusty corosync[30095]: [QUORUM] node 167838839 state=1, votes=1, expected=17
May 8 10:26:23 krusty corosync[30095]: [QUORUM] node 167838841 state=2, votes=1, expected=17
May 8 10:26:23 krusty corosync[30095]: [QUORUM] node 167838842 state=1, votes=1, expected=17
May 8 10:26:23 krusty corosync[30095]: [QUORUM] node 167838844 state=1, votes=1, expected=17
May 8 10:26:23 krusty corosync[30095]: [QUORUM] node 167838846 state=1, votes=1, expected=17
May 8 10:26:23 krusty corosync[30095]: [QUORUM] node 167838849 state=1, votes=1, expected=17
May 8 10:26:23 krusty corosync[30095]: [QUORUM] node 167838850 state=1, votes=1, expected=17
May 8 10:26:23 krusty corosync[30095]: [QUORUM] node 167838851 state=1, votes=1, expected=17
May 8 10:26:23 krusty corosync[30095]: [QUORUM] node 167838853 state=1, votes=1, expected=17
May 8 10:26:23 krusty corosync[30095]: [QUORUM] quorum lost, blocking activity
May 8 10:26:23 krusty corosync[30095]: [QUORUM] got nodeinfo message from cluster node 167838851
May 8 10:26:23 krusty corosync[30095]: [QUORUM] nodeinfo message[0]: votes: 0, expected: 0 flags: 0
May 8 10:26:23 krusty corosync[30095]: [QUORUM] got nodeinfo message from cluster node 167838853
May 8 10:26:23 krusty corosync[30095]: [QUORUM] nodeinfo message[167838853]: votes: 1, expected: 17 flags: 1
May 8 10:26:23 krusty corosync[30095]: [QUORUM] flags: quorate: Yes Leaving: No WFA Status: No First: No Qdevice: No QdeviceAlive: No QdeviceCastVote: No QdeviceMasterWins: No
May 8 10:26:23 krusty corosync[30095]: [QUORUM] total_votes=8, expected_votes=17
May 8 10:26:23 krusty corosync[30095]: [QUORUM] node 167838839 state=1, votes=1, expected=17
May 8 10:26:23 krusty corosync[30095]: [QUORUM] node 167838841 state=2, votes=1, expected=17
May 8 10:26:23 krusty corosync[30095]: [QUORUM] node 167838842 state=1, votes=1, expected=17
May 8 10:26:23 krusty corosync[30095]: [QUORUM] node 167838844 state=1, votes=1, expected=17
May 8 10:26:23 krusty corosync[30095]: [QUORUM] node 167838846 state=1, votes=1, expected=17
May 8 10:26:23 krusty corosync[30095]: [QUORUM] node 167838849 state=1, votes=1, expected=17
May 8 10:26:23 krusty corosync[30095]: [QUORUM] node 167838850 state=1, votes=1, expected=17
May 8 10:26:23 krusty corosync[30095]: [QUORUM] node 167838851 state=1, votes=1, expected=17
May 8 10:26:23 krusty corosync[30095]: [QUORUM] node 167838853 state=1, votes=1, expected=17
May 8 10:26:23 krusty corosync[30095]: [QUORUM] got nodeinfo message from cluster node 167838853
May 8 10:26:23 krusty corosync[30095]: [QUORUM] nodeinfo message[0]: votes: 0, expected: 0 flags: 0
May 8 10:26:23 krusty corosync[30095]: [QUORUM] got nodeinfo message from cluster node 167838839
May 8 10:26:23 krusty corosync[30095]: [QUORUM] nodeinfo message[167838839]: votes: 1, expected: 17 flags: 1
May 8 10:26:23 krusty corosync[30095]: [QUORUM] flags: quorate: Yes Leaving: No WFA Status: No First: No Qdevice: No QdeviceAlive: No QdeviceCastVote: No QdeviceMasterWins: No
May 8 10:26:23 krusty corosync[30095]: [QUORUM] total_votes=8, expected_votes=17
May 8 10:26:23 krusty corosync[30095]: [QUORUM] node 167838839 state=1, votes=1, expected=17
May 8 10:26:23 krusty corosync[30095]: [QUORUM] node 167838841 state=2, votes=1, expected=17
May 8 10:26:23 krusty corosync[30095]: [QUORUM] node 167838842 state=1, votes=1, expected=17
May 8 10:26:23 krusty corosync[30095]: [QUORUM] node 167838844 state=1, votes=1, expected=17
May 8 10:26:23 krusty corosync[30095]: [QUORUM] node 167838846 state=1, votes=1, expected=17
May 8 10:26:23 krusty corosync[30095]: [QUORUM] node 167838849 state=1, votes=1, expected=17
May 8 10:26:23 krusty corosync[30095]: [QUORUM] node 167838850 state=1, votes=1, expected=17
May 8 10:26:23 krusty corosync[30095]: [QUORUM] node 167838851 state=1, votes=1, expected=17
May 8 10:26:23 krusty corosync[30095]: [QUORUM] node 167838853 state=1, votes=1, expected=17
May 8 10:26:23 krusty corosync[30095]: [QUORUM] got nodeinfo message from cluster node 167838839
May 8 10:26:23 krusty corosync[30095]: [QUORUM] nodeinfo message[0]: votes: 0, expected: 0 flags: 0
May 8 10:26:23 krusty corosync[30095]: [QUORUM] got nodeinfo message from cluster node 167838842
May 8 10:26:23 krusty corosync[30095]: [QUORUM] nodeinfo message[167838842]: votes: 1, expected: 17 flags: 1
May 8 10:26:23 krusty corosync[30095]: [QUORUM] flags: quorate: Yes Leaving: No WFA Status: No First: No Qdevice: No QdeviceAlive: No QdeviceCastVote: No QdeviceMasterWins: No
May 8 10:26:23 krusty corosync[30095]: [QUORUM] total_votes=8, expected_votes=17
May 8 10:26:23 krusty corosync[30095]: [QUORUM] node 167838839 state=1, votes=1, expected=17
May 8 10:26:23 krusty corosync[30095]: [QUORUM] node 167838841 state=2, votes=1, expected=17
May 8 10:26:23 krusty corosync[30095]: [QUORUM] node 167838842 state=1, votes=1, expected=17
May 8 10:26:23 krusty corosync[30095]: [QUORUM] node 167838844 state=1, votes=1, expected=17
May 8 10:26:23 krusty corosync[30095]: [QUORUM] node 167838846 state=1, votes=1, expected=17
May 8 10:26:23 krusty corosync[30095]: [QUORUM] node 167838849 state=1, votes=1, expected=17
May 8 10:26:23 krusty corosync[30095]: [QUORUM] node 167838850 state=1, votes=1, expected=17
May 8 10:26:23 krusty corosync[30095]: [QUORUM] node 167838851 state=1, votes=1, expected=17
May 8 10:26:23 krusty corosync[30095]: [QUORUM] node 167838853 state=1, votes=1, expected=17
May 8 10:26:23 krusty corosync[30095]: [QUORUM] got nodeinfo message from cluster node 167838842
May 8 10:26:23 krusty corosync[30095]: [QUORUM] nodeinfo message[0]: votes: 0, expected: 0 flags: 0
May 8 10:26:23 krusty corosync[30095]: [QUORUM] got nodeinfo message from cluster node 167838844
May 8 10:26:23 krusty corosync[30095]: [QUORUM] nodeinfo message[167838844]: votes: 1, expected: 17 flags: 1
May 8 10:26:23 krusty corosync[30095]: [QUORUM] flags: quorate: Yes Leaving: No WFA Status: No First: No Qdevice: No QdeviceAlive: No QdeviceCastVote: No QdeviceMasterWins: No
May 8 10:26:23 krusty corosync[30095]: [QUORUM] total_votes=8, expected_votes=17
May 8 10:26:23 krusty corosync[30095]: [QUORUM] node 167838839 state=1, votes=1, expected=17
May 8 10:26:23 krusty corosync[30095]: [QUORUM] node 167838841 state=2, votes=1, expected=17
May 8 10:26:23 krusty corosync[30095]: [QUORUM] node 167838842 state=1, votes=1, expected=17
May 8 10:26:23 krusty corosync[30095]: [QUORUM] node 167838844 state=1, votes=1, expected=17
May 8 10:26:23 krusty corosync[30095]: [QUORUM] node 167838846 state=1, votes=1, expected=17
May 8 10:26:23 krusty corosync[30095]: [QUORUM] node 167838849 state=1, votes=1, expected=17
May 8 10:26:23 krusty corosync[30095]: [QUORUM] node 167838850 state=1, votes=1, expected=17
May 8 10:26:23 krusty corosync[30095]: [QUORUM] node 167838851 state=1, votes=1, expected=17
May 8 10:26:23 krusty corosync[30095]: [QUORUM] node 167838853 state=1, votes=1, expected=17
May 8 10:26:23 krusty corosync[30095]: [QUORUM] got nodeinfo message from cluster node 167838844
May 8 10:26:23 krusty corosync[30095]: [QUORUM] nodeinfo message[0]: votes: 0, expected: 0 flags: 0
May 8 10:26:23 krusty corosync[30095]: [QUORUM] got nodeinfo message from cluster node 167838846
May 8 10:26:23 krusty corosync[30095]: [QUORUM] nodeinfo message[167838846]: votes: 1, expected: 17 flags: 1
May 8 10:26:23 krusty corosync[30095]: [QUORUM] flags: quorate: Yes Leaving: No WFA Status: No First: No Qdevice: No QdeviceAlive: No QdeviceCastVote: No QdeviceMasterWins: No
May 8 10:26:23 krusty corosync[30095]: [QUORUM] total_votes=8, expected_votes=17
May 8 10:26:23 krusty corosync[30095]: [QUORUM] node 167838839 state=1, votes=1, expected=17
May 8 10:26:23 krusty corosync[30095]: [QUORUM] node 167838841 state=2, votes=1, expected=17
May 8 10:26:23 krusty corosync[30095]: [QUORUM] node 167838842 state=1, votes=1, expected=17
May 8 10:26:23 krusty corosync[30095]: [QUORUM] node 167838844 state=1, votes=1, expected=17
May 8 10:26:23 krusty corosync[30095]: [QUORUM] node 167838846 state=1, votes=1, expected=17
May 8 10:26:23 krusty corosync[30095]: [QUORUM] node 167838849 state=1, votes=1, expected=17
May 8 10:26:23 krusty rsyslogd-2177: imuxsock begins to drop messages from pid 30095 due to rate-limiting
May 8 10:26:23 krusty crmd[30689]: notice: pcmk_quorum_notification: Membership 47416: quorum lost (8)
May 8 10:26:24 krusty crmd[30689]: notice: pcmk_quorum_notification: Membership 47420: quorum acquired (9)
May 8 10:26:25 krusty crmd[30689]: notice: corosync_node_name: Unable to get node name for nodeid 167838841
May 8 10:27:37 krusty stonith-ng[30687]: notice: unpack_config: On loss of CCM Quorum: Ignore
May 8 10:27:37 krusty stonith-ng[30687]: warning: create_node: Detected multiple node entries with uname=quimby - this is rarely intended
May 8 10:27:37 krusty stonith-ng[30687]: warning: create_node: Detected multiple node entries with uname=todd - this is rarely intended
May 8 10:27:37 krusty stonith-ng[30687]: warning: create_node: Detected multiple node entries with uname=rod - this is rarely intended
May 8 10:27:37 krusty stonith-ng[30687]: warning: create_node: Detected multiple node entries with uname=maude - this is rarely intended
May 8 10:27:37 krusty stonith-ng[30687]: warning: create_node: Detected multiple node entries with uname=krusty - this is rarely intended
May 8 10:27:37 krusty stonith-ng[30687]: warning: create_node: Detected multiple node entries with uname=itchy - this is rarely intended
May 8 10:27:37 krusty stonith-ng[30687]: warning: create_node: Detected multiple node entries with uname=sideshowbob - this is rarely intended
May 8 10:27:37 krusty stonith-ng[30687]: warning: create_node: Detected multiple node entries with uname=willie - this is rarely intended
May 8 10:27:37 krusty stonith-ng[30687]: warning: create_node: Detected multiple node entries with uname=ned - this is rarely intended
May 8 10:27:37 krusty stonith-ng[30687]: warning: create_node: Detected multiple node entries with uname=moe - this is rarely intended
May 8 10:27:37 krusty stonith-ng[30687]: warning: create_node: Detected multiple node entries with uname=nelson - this is rarely intended
May 8 10:27:37 krusty stonith-ng[30687]: warning: create_node: Detected multiple node entries with uname=scratchy - this is rarely intended
May 8 10:27:37 krusty stonith-ng[30687]: warning: create_node: Detected multiple node entries with uname=fattony - this is rarely intended
May 8 10:27:37 krusty stonith-ng[30687]: warning: create_node: Detected multiple node entries with uname=barney - this is rarely intended
May 8 10:32:47 krusty stonith-ng[30687]: notice: unpack_config: On loss of CCM Quorum: Ignore
May 8 10:32:47 krusty stonith-ng[30687]: warning: create_node: Detected multiple node entries with uname=quimby - this is rarely intended
May 8 10:32:47 krusty stonith-ng[30687]: warning: create_node: Detected multiple node entries with uname=todd - this is rarely intended
May 8 10:32:47 krusty stonith-ng[30687]: warning: create_node: Detected multiple node entries with uname=rod - this is rarely intended
May 8 10:32:47 krusty stonith-ng[30687]: warning: create_node: Detected multiple node entries with uname=maude - this is rarely intended
May 8 10:32:47 krusty stonith-ng[30687]: warning: create_node: Detected multiple node entries with uname=krusty - this is rarely intended
May 8 10:32:47 krusty stonith-ng[30687]: warning: create_node: Detected multiple node entries with uname=itchy - this is rarely intended
May 8 10:32:47 krusty stonith-ng[30687]: warning: create_node: Detected multiple node entries with uname=sideshowbob - this is rarely intended
May 8 10:32:47 krusty stonith-ng[30687]: warning: create_node: Detected multiple node entries with uname=willie - this is rarely intended
May 8 10:32:47 krusty stonith-ng[30687]: warning: create_node: Detected multiple node entries with uname=ned - this is rarely intended
May 8 10:32:47 krusty stonith-ng[30687]: warning: create_node: Detected multiple node entries with uname=moe - this is rarely intended
May 8 10:32:47 krusty stonith-ng[30687]: warning: create_node: Detected multiple node entries with uname=nelson - this is rarely intended
May 8 10:32:47 krusty stonith-ng[30687]: warning: create_node: Detected multiple node entries with uname=scratchy - this is rarely intended
May 8 10:32:47 krusty stonith-ng[30687]: warning: create_node: Detected multiple node entries with uname=fattony - this is rarely intended
May 8 10:32:47 krusty stonith-ng[30687]: warning: create_node: Detected multiple node entries with uname=barney - this is rarely intended
May 8 10:34:23 krusty stonith-ng[30687]: notice: unpack_config: On loss of CCM Quorum: Ignore
May 8 10:34:23 krusty stonith-ng[30687]: warning: create_node: Detected multiple node entries with uname=quimby - this is rarely intended
May 8 10:34:23 krusty stonith-ng[30687]: warning: create_node: Detected multiple node entries with uname=todd - this is rarely intended
May 8 10:34:23 krusty stonith-ng[30687]: warning: create_node: Detected multiple node entries with uname=rod - this is rarely intended
May 8 10:34:23 krusty stonith-ng[30687]: warning: create_node: Detected multiple node entries with uname=maude - this is rarely intended
May 8 10:34:23 krusty stonith-ng[30687]: warning: create_node: Detected multiple node entries with uname=krusty - this is rarely intended
May 8 10:34:23 krusty stonith-ng[30687]: warning: create_node: Detected multiple node entries with uname=itchy - this is rarely intended
May 8 10:34:23 krusty stonith-ng[30687]: warning: create_node: Detected multiple node entries with uname=sideshowbob - this is rarely intended
May 8 10:34:23 krusty stonith-ng[30687]: warning: create_node: Detected multiple node entries with uname=willie - this is rarely intended
May 8 10:34:23 krusty stonith-ng[30687]: warning: create_node: Detected multiple node entries with uname=ned - this is rarely intended
May 8 10:34:23 krusty stonith-ng[30687]: warning: create_node: Detected multiple node entries with uname=moe - this is rarely intended
May 8 10:34:23 krusty stonith-ng[30687]: warning: create_node: Detected multiple node entries with uname=nelson - this is rarely intended
May 8 10:34:23 krusty stonith-ng[30687]: warning: create_node: Detected multiple node entries with uname=scratchy - this is rarely intended
May 8 10:34:23 krusty stonith-ng[30687]: warning: create_node: Detected multiple node entries with uname=fattony - this is rarely intended
May 8 10:34:23 krusty stonith-ng[30687]: warning: create_node: Detected multiple node entries with uname=barney - this is rarely intended
May 8 10:34:23 krusty stonith-ng[30687]: warning: unpack_rsc_location: No resource (con=cli-standby-res_IPaddr2_NXTProduction, rsc=res_IPaddr2_NXTProduction)
May 8 10:34:23 krusty stonith-ng[30687]: warning: unpack_rsc_location: No resource (con=cli-prefer-res_IPaddr2_NXTProduction, rsc=res_IPaddr2_NXTProduction)
May 8 10:34:46 krusty pacemakerd[30683]: error: cfg_connection_destroy: Connection destroyed
May 8 10:34:46 krusty pacemakerd[30683]: notice: pcmk_shutdown_worker: Shuting down Pacemaker
May 8 10:34:46 krusty pacemakerd[30683]: notice: stop_child: Stopping crmd: Sent -15 to process 30689
May 8 10:34:46 krusty crmd[30689]: notice: crm_shutdown: Requesting shutdown, upper limit is 1200000ms
May 8 10:34:46 krusty pacemakerd[30683]: error: cpg_connection_destroy: Connection destroyed
May 8 10:34:46 krusty stonith-ng[30687]: error: pcmk_cpg_dispatch: Connection to the CPG API failed: 2
May 8 10:34:46 krusty stonith-ng[30687]: error: stonith_peer_ais_destroy: AIS connection terminated
May 8 10:34:46 krusty crmd[30689]: error: crm_ipc_read: Connection to stonith-ng failed
May 8 10:34:46 krusty crmd[30689]: error: mainloop_gio_callback: Connection to stonith-ng[0x1cce4e0] closed (I/O condition=17)
May 8 10:34:46 krusty crmd[30689]: crit: tengine_stonith_connection_destroy: Fencing daemon connection failed
May 8 10:34:46 krusty attrd[30688]: error: pcmk_cpg_dispatch: Connection to the CPG API failed: 2
May 8 10:34:46 krusty attrd[30688]: crit: attrd_ais_destroy: Lost connection to Corosync service!
May 8 10:34:46 krusty attrd[30688]: notice: main: Exiting...
May 8 10:34:46 krusty attrd[30688]: notice: main: Disconnecting client 0x22e9eb0, pid=30689...
May 8 10:34:46 krusty attrd[30688]: error: attrd_cib_connection_destroy: Connection to the CIB terminated...
May 8 10:34:46 krusty cib[30686]: error: pcmk_cpg_dispatch: Connection to the CPG API failed: 2
May 8 10:34:46 krusty cib[30686]: error: cib_ais_destroy: Corosync connection lost! Exiting.
May 8 10:34:47 krusty crmd[30689]: error: te_connect_stonith: Sign-in failed: triggered a retry
May 8 10:34:47 krusty crmd[30689]: error: crm_ipc_read: Connection to cib_shm failed
May 8 10:34:47 krusty crmd[30689]: error: mainloop_gio_callback: Connection to cib_shm[0x1baf020] closed (I/O condition=17)
May 8 10:34:47 krusty crmd[30689]: error: crmd_cib_connection_destroy: Connection to the CIB terminated...
May 8 10:34:47 krusty crmd[30689]: error: crmd_quorum_destroy: connection terminated
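
As a reading aid for the syslog above, the sketch below decodes the numeric node IDs and recomputes the quorum threshold. It rests on two assumptions that the thread does not state outright: that the node IDs were auto-generated from each node's IPv4 address (the addresses 10.1.4.121, 10.1.4.124 and 10.1.4.133 in the strace output do match three of the IDs), and that quorum is a simple majority of expected_votes (which agrees with "quorum lost (8)" and "quorum acquired (9)" in the crmd messages). The function names are illustrative only.

```python
import ipaddress


def nodeid_to_ipv4(nodeid):
    """Interpret a 32-bit node ID as a dotted-quad IPv4 address (assumption)."""
    return str(ipaddress.IPv4Address(nodeid))


def quorum_threshold(expected_votes):
    """Smallest vote count that is a strict majority of expected_votes."""
    return expected_votes // 2 + 1


def is_quorate(total_votes, expected_votes):
    """True when total_votes reaches the majority threshold."""
    return total_votes >= quorum_threshold(expected_votes)


# Node IDs copied from the syslog excerpt in this thread.
for nodeid in (167838839, 167838841, 167838844, 167838853):
    print(nodeid, "->", nodeid_to_ipv4(nodeid))

# Vote counts copied from the [QUORUM] and crmd lines.
print("threshold for expected_votes=17:", quorum_threshold(17))
print("quorate with 8 votes:", is_quorate(8, 17))
print("quorate with 9 votes:", is_quorate(9, 17))
```

Under these assumptions, node 167838841 (the one reported with state=2) corresponds to 10.1.4.121, and 8 votes out of an expected 17 fall one short of the 9 needed for quorum.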








_______________________________________________
discuss mailing list
discuss@xxxxxxxxxxxx
http://lists.corosync.org/mailman/listinfo/discuss
