I received a report of a virsh coredump that happened on a heavy loaded system. Debugging the corefile I saw that under certain circumstances the mutex for a connection can be destroyed while another thread is waiting for the lock on that connection. Specifically this seems to happen when vshDeinit is called after the completion of vshRunCommand and at the same time the event loop is receiving a client disconnect. vshDeinit calls virConnectClose which unrefs the connection object destroying the lock mutex before releasing the memory and nulling the callback pointers. At the same time the event loop will call remoteClientCloseFunc which waits for the mutex while the connection is unrefed. It will obtain the (now invalid but non-error checking) mutex, copy the closeCallback pointer (shorty before the connection object is destroyed) and will die upon trying to call closeFreeCallback (NULL by now). First I would like to see whether someone finds an obvious flaw in my reasoning. Then I'd would like to discuss how to get around it. One thought would be to increase the refcount of the connection object in remoteClientCloseFunc before calling the closeCallback. -- Mit freundlichen Grüßen/Kind Regards Viktor Mihajlovski IBM Deutschland Research & Development GmbH Vorsitzender des Aufsichtsrats: Martina Köderitz Geschäftsführung: Dirk Wittkopp Sitz der Gesellschaft: Böblingen Registergericht: Amtsgericht Stuttgart, HRB 243294 -- libvir-list mailing list libvir-list@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/libvir-list