segmentation fault at pthread_join

Hi All,

My environment (corosync-1.4.5) hit a segmentation fault at the following place:

(gdb) bt
#0  0x004f9012 in pthread_join () from /lib/libpthread.so.0
#1  0x00ba6956 in conn_info_destroy (fd=15, revent=17, context=0x8dd78a0)
    at coroipcs.c:503
#2  coroipcs_handler_dispatch (fd=15, revent=17, context=0x8dd78a0)
    at coroipcs.c:1617
#3  0x0804c63b in corosync_poll_handler_dispatch (
    handle=150346236434579456, fd=15, revent=17, context=0x8dd78a0)
    at main.c:1105
#4  0x00d7e994 in poll_run (handle=150346236434579456) at coropoll.c:513
#5  0x0804d697 in main (argc=2, argv=0xbfd7ad54, envp=0xbfd7ad60)
    at main.c:1874
(gdb) f 1
#1  0x00ba6956 in conn_info_destroy (fd=15, revent=17, context=0x8dd78a0)
    at coroipcs.c:503
503                     res = pthread_join (conn_info->thread, &retval);
(gdb) p conn_info->thread
$1 = 0

gdb shows that pthread_join() tried to join an IPC consumer thread that does not exist. The cause I found is that coroipcs_handler_dispatch() failed to create the thread and did not check the return value of pthread_create(), which had failed because the system was out of memory. When this happens, the IPC client sees the connection as successfully created, but all subsequent IPC requests block and never return. So I pressed CTRL+C to quit the client application, which closed the IPC connection on the client side. At that point the server side called pthread_join() and hit the segmentation fault.
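
To illustrate the missing check, here is a minimal standalone sketch, not the actual corosync code. The names struct conn, conn_start(), conn_stop() and failing_create() are hypothetical and exist only for this example. It records whether pthread_create() succeeded in a separate flag instead of comparing the pthread_t against zero, since POSIX treats pthread_t as an opaque type.

```c
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for conn_info; only the fields this sketch needs. */
struct conn {
    pthread_t thread;
    bool thread_started;   /* true only after pthread_create() succeeded */
    int refcount;
};

/* Same shape as pthread_create(), so a failing stub can be swapped in. */
typedef int (*create_fn)(pthread_t *, const pthread_attr_t *,
                         void *(*)(void *), void *);

/* Trivial worker that exits immediately. */
static void *worker(void *arg)
{
    (void)arg;
    return NULL;
}

/* Stub that simulates pthread_create() failing for lack of resources. */
int failing_create(pthread_t *t, const pthread_attr_t *a,
                   void *(*fn)(void *), void *arg)
{
    (void)t; (void)a; (void)fn; (void)arg;
    return EAGAIN;
}

/* Start the worker. Returns 0 on success or the error code from create(). */
int conn_start(struct conn *c, create_fn create)
{
    int res;

    c->thread_started = false;
    c->refcount++;                      /* reference held for the worker */
    res = create(&c->thread, NULL, worker, c);
    if (res != 0) {
        c->refcount--;                  /* no worker exists to hold it */
        return res;
    }
    c->thread_started = true;
    return 0;
}

/* Join the worker only if it was really created, then drop its reference. */
int conn_stop(struct conn *c)
{
    void *retval;
    int res;

    if (!c->thread_started) {
        return 0;                       /* nothing to join */
    }
    res = pthread_join(c->thread, &retval);
    c->thread_started = false;
    c->refcount--;
    return res;
}
```

Calling conn_start(&c, pthread_create) gives the normal path, while conn_start(&c, failing_create) exercises the failure path without having to exhaust memory. In both cases conn_stop() is safe to call afterwards.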

The fix for the segmentation fault is simply to check in conn_info_destroy() whether conn_info->thread is zero. If it is, we should skip the call to pthread_join() and decrement the IPC connection's refcount (which was incremented in coroipcs_handler_dispatch()).

So I changed the conn_info_destroy() code to the following:

        if (conn_info->state == CONN_STATE_THREAD_REQUEST_EXIT) {
                if (0 != conn_info->thread) {
                        res = pthread_join (conn_info->thread, &retval);
                } else {
                        coroipcs_refcount_dec (conn_info);
                }
                conn_info->state = CONN_STATE_THREAD_DESTROYED;
                return (0);
        }

 

But this fix does nothing for the client-side IPC blocking problem, because once the above code returns 0 to coropoll.c, coroipcs_handler_dispatch() gets no chance to be called again.

Any ideas?


--
Yours,
Jason
