Re: coroipcs_ipc_service_exit() dead loop

jason <huzhijiang@xxxxxxxxx> · Mon, 22 Apr 2013 23:01:56 +0800

Sorry, in the previous mail I didn't realize that once service_exit_schedwrk_handler() for confdb is done, the notify_pipe is closed, so ipc_dispatch_send_from_poll_thread() won't increase conn->refcount. But if the scenario below occurs, the dead loop can still happen:

1. confdb_notify_lib_of_key_change()/confdb_notify_lib_of_new_object()/... (before objdb_notify_dispatch() runs)
2. service_exit_schedwrk_handler()
3. service_unlink_schedwrk_handler() // dead loop!
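
To make the ordering concrete, here is a toy model of the three steps. It is only a sketch under my own assumptions: the struct, the toy_* helpers and the retry cap are hypothetical stand-ins and not the real corosync code. It assumes destroy reports -1 only when no references remain, and that the exit handler closes the notify pipe so later notifications take no reference.

```c
#include <stdbool.h>

/* Toy model only: hypothetical stand-ins, not the real corosync types. */
struct toy_conn {
	int  refcount;          /* references held on the connection */
	bool notify_pipe_open;  /* closed by the exit handler in this model */
};

/* Step 1: a notification takes a reference only while the pipe is open. */
static void toy_notify(struct toy_conn *c)
{
	if (c->notify_pipe_open)
		c->refcount++;
}

/* Step 2: the exit handler closes the pipe. In this model nothing
 * drops an already-taken reference afterwards. */
static void toy_service_exit(struct toy_conn *c)
{
	c->notify_pipe_open = false;
}

/* Assumed behaviour: destroy returns -1 only when no references remain. */
static int toy_conn_destroy(struct toy_conn *c)
{
	return (c->refcount == 0) ? -1 : 0;
}

/* Step 3: bounded version of the unlink loop, so the model itself
 * cannot hang. Returns true if the loop terminated. */
static bool toy_unlink(struct toy_conn *c, int max_tries)
{
	for (int i = 0; i < max_tries; i++) {
		if (toy_conn_destroy(c) == -1)
			return true;
	}
	return false;
}

/* Run the sequence with the notification before or after the exit handler. */
static bool toy_run(bool notify_before_exit)
{
	struct toy_conn c = { 0, true };

	if (notify_before_exit)
		toy_notify(&c);
	toy_service_exit(&c);
	if (!notify_before_exit)
		toy_notify(&c);
	return toy_unlink(&c, 1000);
}
```

In this model toy_run(false) terminates, because the notification arrives after the pipe is closed and takes no reference, while toy_run(true) exhausts its retries, which corresponds to the dead loop in step 3.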

On Mon, Apr 22, 2013 at 10:29 PM, jason <huzhijiang@xxxxxxxxx> wrote:

Hi All,
I encountered a dead loop in the following code:

coroipcs_ipc_service_exit() {
	...
		while (conn_info_destroy (conn_info) != -1)
			;
} 

It happened when the confdb service side was notifying the library side about a key change (or an object being created/destroyed) while corosync was unloading. When it happened, I saw conn_info->refcount = 3, and it was a confdb IPC connection.

By analysing the code I found that there is a gap between service_exit_schedwrk_handler() and service_unlink_schedwrk_handler(). If the confdb service side calls confdb_notify_lib_of_key_change() in this gap (triggered by some other service), conn_info->refcount is increased by ipc_dispatch_send_from_poll_thread(). Then, once we are in coroipcs_ipc_service_exit(), the dead loop happens.

Furthermore, after service_exit_schedwrk_handler() for confdb is done, objdb_notify_dispatch() is unregistered from poll, so there is no further chance to decrease conn->refcount after this (even if we somehow get past the dead loop).
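
The point about the unregistered handler can be sketched as below. Again this is a toy model under my own assumptions (the sketch_* names and the single-slot poll table are hypothetical, not corosync's poll API): the dispatch handler is taken to be the only place a pending reference is dropped, so once it is removed from poll the count stays where it is.

```c
#include <stddef.h>

/* Toy model only: hypothetical names, not corosync's poll API. */
struct sketch_conn {
	int refcount;  /* references still held on the connection */
};

typedef void (*sketch_handler)(struct sketch_conn *);

struct sketch_poll {
	sketch_handler dispatch;  /* NULL once the handler is unregistered */
};

/* Assumed to be the only place a pending reference is dropped. */
static void sketch_dispatch(struct sketch_conn *c)
{
	if (c->refcount > 0)
		c->refcount--;
}

/* One poll iteration: calls the handler only if it is still registered. */
static void sketch_poll_run(struct sketch_poll *p, struct sketch_conn *c)
{
	if (p->dispatch != NULL)
		p->dispatch(c);
}

/* Returns the refcount left after one poll iteration, starting from one
 * pending reference taken by a notification. */
static int sketch_pending_after(int unregister_first)
{
	struct sketch_conn c = { 1 };
	struct sketch_poll p = { sketch_dispatch };

	if (unregister_first)
		p.dispatch = NULL;  /* models the exit handler removing it from poll */
	sketch_poll_run(&p, &c);
	return c.refcount;
}
```

With the handler still registered the pending reference is released (sketch_pending_after(0) gives 0); with it unregistered first the reference is never released (sketch_pending_after(1) gives 1).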

The above is my conclusion from code analysis only. I don't have any idea how to fix it yet, and I'm not even sure it is the root cause of the dead loop. Please help.

-- 

Yours,
Jason

