Re: segmentation fault at pthread_join

Jason,
thanks again for your very detailed analysis. What is pthread_create
returning? EAGAIN? If so, I would probably take the approach of "instead of
solving the problem, just don't allow the problem to appear". In other
words, I would check res = pthread_create(...), and if it is EAGAIN, just
clear the conn_info variables such as ref_count, state, private_data, ...
and call ipc_disconnect. What do you think about that?
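
To make the idea concrete, here is a minimal sketch, not the actual
coroipcs.c code. The names conn_sketch, conn_sketch_start,
conn_sketch_destroy, create_always_eagain and the thread_started flag are
all made up for illustration, and the refcount handling is simplified. It
only shows the pattern: check the pthread_create result, undo the
bookkeeping on failure, and remember whether a thread really exists so the
destroy path never joins one that was never created.

```c
#include <errno.h>
#include <pthread.h>

/* Hypothetical, simplified stand-in for struct conn_info. */
enum conn_sketch_state {
	CONN_SKETCH_INACTIVE,
	CONN_SKETCH_ACTIVE,
	CONN_SKETCH_DESTROYED
};

struct conn_sketch {
	pthread_t thread;
	int thread_started;	/* explicit flag: pthread_t is an opaque type */
	int refcount;
	enum conn_sketch_state state;
};

/* Same parameter list as pthread_create, so a failing double can be injected. */
typedef int (*create_fn) (pthread_t *, const pthread_attr_t *,
	void *(*) (void *), void *);

/* Placeholder for the ipc consumer thread body. */
static void *consumer_sketch (void *arg)
{
	(void)arg;
	return (NULL);
}

/* Test double that fails the way an out-of-resources pthread_create does. */
static int create_always_eagain (pthread_t *thread, const pthread_attr_t *attr,
	void *(*fn) (void *), void *arg)
{
	(void)thread; (void)attr; (void)fn; (void)arg;
	return (EAGAIN);
}

/* Start the consumer thread; on failure, roll back and report the error. */
static int conn_sketch_start (struct conn_sketch *conn, create_fn create)
{
	int res;

	conn->refcount++;	/* reference held on behalf of the thread */
	/* pthread_create returns the error number; it does not set errno. */
	res = create (&conn->thread, NULL, consumer_sketch, conn);
	if (res != 0) {
		conn->refcount--;
		conn->thread_started = 0;
		conn->state = CONN_SKETCH_INACTIVE;
		return (res);	/* the caller would disconnect the client here */
	}
	conn->thread_started = 1;
	conn->state = CONN_SKETCH_ACTIVE;
	return (0);
}

/* Join only a thread that was really created, then drop its reference. */
static int conn_sketch_destroy (struct conn_sketch *conn)
{
	int res = 0;

	if (conn->thread_started) {
		res = pthread_join (conn->thread, NULL);
		conn->thread_started = 0;
		conn->refcount--;
	}
	conn->state = CONN_SKETCH_DESTROYED;
	return (res);
}
```

In the real code the caller would pass pthread_create itself and, on a
non-zero result, disconnect the client right away so it does not block on
requests that nobody will ever serve. I used a separate flag rather than
comparing conn_info->thread with 0, because POSIX treats pthread_t as an
opaque type and does not define a "no thread" value.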

Honza

jason wrote:
> Hi All,
> 
> My environment (corosync-1.4.5) encountered a segmentation fault at the
> following place.
> 
> (gdb) bt
> #0  0x004f9012 in pthread_join () from /lib/libpthread.so.0
> #1  0x00ba6956 in conn_info_destroy (fd=15, revent=17, context=0x8dd78a0)
>     at coroipcs.c:503
> #2  coroipcs_handler_dispatch (fd=15, revent=17, context=0x8dd78a0)
>     at coroipcs.c:1617
> #3  0x0804c63b in corosync_poll_handler_dispatch (
>     handle=150346236434579456, fd=15, revent=17, context=0x8dd78a0)
>     at main.c:1105
> #4  0x00d7e994 in poll_run (handle=150346236434579456) at coropoll.c:513
> #5  0x0804d697 in main (argc=2, argv=0xbfd7ad54, envp=0xbfd7ad60)
>     at main.c:1874
> (gdb) f 1
> #1  0x00ba6956 in conn_info_destroy (fd=15, revent=17, context=0x8dd78a0)
>     at coroipcs.c:503
> 503                     res = pthread_join (conn_info->thread, &retval);
> (gdb) p conn_info->thread
> $1 = 0
> 
> gdb shows that pthread_join tried to join an ipc consumer thread that does
> not exist. The reason, I found, is that coroipcs_handler_dispatch() failed
> to create the thread and did not check the return value of pthread_create(),
> which failed because the system was out of memory. When this happens, the
> ipc client side sees the ipc connection created successfully, but all
> subsequent ipc requests block and never return. So I pressed CTRL+C to quit
> the client application and close the ipc connection on the client side. At
> this point, the server side calls pthread_join and gets the segmentation
> fault.
> 
> The solution to the segmentation fault is simply to check whether
> conn_info->thread is zero in conn_info_destroy(). If it is, we should
> skip the call to pthread_join() and decrease the ipc's refcount (which was
> increased in coroipcs_handler_dispatch()).
> 
> So I changed the conn_info_destroy() code to the following:
> 
>         if (conn_info->state == CONN_STATE_THREAD_REQUEST_EXIT) {
>                 if (0 != conn_info->thread) {
>                         res = pthread_join (conn_info->thread, &retval);
>                 } else {
>                         coroipcs_refcount_dec (conn_info);
>                 }
>                 conn_info->state = CONN_STATE_THREAD_DESTROYED;
>                 return (0);
>         }
> 
> 
> 
> But this solution does nothing for the client ipc blocking problem, because
> when the above code returns 0 to coropoll.c, coroipcs_handler_dispatch gets
> no chance to be called again.
> 
> Any ideas?
> 
> 
> 
> 
> _______________________________________________
> discuss mailing list
> discuss@xxxxxxxxxxxx
> http://lists.corosync.org/mailman/listinfo/discuss
> 




