Forwarding to corosync list

-------- Original Message --------
Subject: Re: Corosync on NetBSD
Date: Mon, 19 Mar 2012 14:44:57 +0100
From: Stephan <stephanwib@xxxxxxxxxxxxxx>
To: netbsd-users@xxxxxxxxxx
CC: tech-kern@xxxxxxxxxx, Steven Dake <sdake@xxxxxxxxxx>

Hi all,

I tried to run corosync again on NetBSD 6 BETA. The build errors are the same, and it still does nothing but use 100% CPU. Luckily, gdb now works in live mode with threads, and shows this:

(gdb) info th
  Id   Target Id   Frame
  5    LWP 1       0x00007f7ff68071e0 in ?? () from /usr/lib/libpthread.so.1
  4    LWP 2       0x00007f7ff6476e5a in ___lwp_park50 () from /usr/lib/libc.so.12
  3    LWP 3       0x00007f7ff643907a in poll () from /usr/lib/libc.so.12
  2    LWP 4       0x00007f7ff6476e5a in ___lwp_park50 () from /usr/lib/libc.so.12
* 1    LWP 0       0x00007f7ff6476e5a in ___lwp_park50 () from /usr/lib/libc.so.12
(gdb) thr 5
[Switching to thread 5 (LWP 1)]
#0  0x00007f7ff68071e0 in ?? () from /usr/lib/libpthread.so.1
(gdb) bt
#0  0x00007f7ff68071e0 in ?? () from /usr/lib/libpthread.so.1
#1  0x00007f7ff68075e8 in ?? () from /usr/lib/libpthread.so.1
#2  0x0000000000409947 in corosync_timer_add_duration (nanosec_duration=1500000000, data=0x0, timer_fn=0x4049b0 <corosync_totem_stats_updater>, handle=0x615518) at timer.c:221
#3  0x000000000040575c in corosync_totem_stats_init () at main.c:820
#4  main_service_ready () at main.c:1410
#5  0x00007f7ff781788b in main_iface_change_fn (context=0x7f7ff7b3c000, iface_addr=<optimized out>, iface_no=0) at totemsrp.c:4454
#6  0x00007f7ff7809473 in timer_function_netif_check_timeout (data=0x7f7ff7384000) at totemudp.c:1388
#7  0x00007f7ff7807780 in timerlist_expire (timerlist=0x7f7ff7b1b0d8) at tlist.h:309
#8  poll_run (handle=150346236434579456) at coropoll.c:526
#9  0x000000000040775a in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at main.c:1846

As you can see, it jumps somewhere into libpthread from the corosync_timer_add_duration() function and ends up in an infinite loop.
As a hack, I just commented everything out:

int corosync_timer_add_duration (
    unsigned long long nanosec_duration,
    void *data,
    void (*timer_fn) (void *data),
    timer_handle *handle)
{
/*
    int res;
    int unlock;

    if (pthread_equal (pthread_self(), expiry_thread) != 0) {
        unlock = 0;
    } else {
        unlock = 1;
        pthread_mutex_lock (&timer_mutex);
    }

    res = timerlist_add_duration (
        &timers_timerlist,
        timer_fn,
        data,
        nanosec_duration,
        handle);

    if (unlock) {
        pthread_mutex_unlock (&timer_mutex);
    }

    pthread_kill (expiry_thread, SIGUSR1);

    return (res);
*/
    return 0;
}

With this change, the corosync service starts successfully and interacts with its control tools.

Does anybody have an idea what could be wrong with the code above?

Stephan

_______________________________________________
discuss mailing list
discuss@xxxxxxxxxxxx
http://lists.corosync.org/mailman/listinfo/discuss