On 24.07.2013 12:29, Daniel P. Berrange wrote:
> On Wed, Jul 24, 2013 at 12:15:32PM +0200, Michal Privoznik wrote:
>> There's a race in the lxc driver causing a deadlock. If a domain is
>> destroyed immediately after being started, the deadlock can occur. When
>> a domain is started, the event loop tries to connect to the monitor. If
>> the connection succeeds, virLXCProcessMonitorInitNotify() is called with
>> @mon->client locked. The first thing that callee does is
>> virObjectLock(vm). So the order of locking is: 1) @mon->client, 2) @vm.
>>
>> However, if there's another thread executing virDomainDestroy on the
>> very same domain, the first thing done here is locking the @vm. Then,
>> the corresponding libvirt_lxc process is killed and the monitor is closed
>> via calling virLXCMonitorClose(). This callee tries to lock @mon->client
>> too. So the order is reversed compared to the first case. This situation
>> results in a deadlock and an unresponsive libvirtd (since the event loop
>> is involved).
>>
>> The proper solution is to unlock the @vm in virLXCMonitorClose prior
>> to entering virNetClientClose(). See the backtrace as follows:
>
> Hmm, I think I'd say that the flaw is in the way virLXCProcessMonitorInitNotify
> is invoked. In the QEMU driver monitor, we unlock the monitor before invoking
> any callbacks. In the LXC driver monitor we're invoking the callbacks with
> the monitor lock held. I think we need to make the LXC monitor locking wrt
> callbacks do what QEMU does, and unlock the monitor. See QEMU_MONITOR_CALLBACK
> in qemu_monitor.c
>
>
> Daniel
>

I don't think so. It's not the monitor lock that is causing the deadlock here.
In fact, the monitor is unlocked:

Thread 1 (Thread 0x7f35a348e740 (LWP 18839)):
#0  0x00007f35a0481714 in __lll_lock_wait () from /lib64/libpthread.so.0
#1  0x00007f35a047d16c in _L_lock_516 () from /lib64/libpthread.so.0
#2  0x00007f35a047cfbb in pthread_mutex_lock () from /lib64/libpthread.so.0
#3  0x00007f35a29ab83f in virMutexLock (m=0x7f3588024e80) at util/virthreadpthread.c:85
#4  0x00007f35a2994d62 in virObjectLock (anyobj=0x7f3588024e70) at util/virobject.c:320
#5  0x00007f358ed5bbd7 in virLXCProcessMonitorInitNotify (mon=0x7f3560000ab0, initpid=29062, vm=0x7f3588024e70) at lxc/lxc_process.c:601
#6  0x00007f358ed59fd3 in virLXCMonitorHandleEventInit (prog=0x7f35600087b0, client=0x7f3560001fd0, evdata=0x7f35a53bc1e0, opaque=0x7f3560000ab0) at lxc/lxc_monitor.c:109
#7  0x00007f35a2ad2206 in virNetClientProgramDispatch (prog=0x7f35600087b0, client=0x7f3560001fd0, msg=0x7f3560002038) at rpc/virnetclientprogram.c:259
#8  0x00007f35a2acf0a0 in virNetClientCallDispatchMessage (client=0x7f3560001fd0) at rpc/virnetclient.c:1019
#9  0x00007f35a2acf72b in virNetClientCallDispatch (client=0x7f3560001fd0) at rpc/virnetclient.c:1140
#10 0x00007f35a2acfdb1 in virNetClientIOHandleInput (client=0x7f3560001fd0) at rpc/virnetclient.c:1312
#11 0x00007f35a2ad0fc1 in virNetClientIncomingEvent (sock=0x7f3560008350, events=1, opaque=0x7f3560001fd0) at rpc/virnetclient.c:1832
#12 0x00007f35a2ae6238 in virNetSocketEventHandle (watch=47, fd=40, events=1, opaque=0x7f3560008350) at rpc/virnetsocket.c:1695
#13 0x00007f35a296f33f in virEventPollDispatchHandles (nfds=22, fds=0x7f35a53bc7a0) at util/vireventpoll.c:498
#14 0x00007f35a296fb62 in virEventPollRunOnce () at util/vireventpoll.c:645
#15 0x00007f35a296dad1 in virEventRunDefaultImpl () at util/virevent.c:273
#16 0x00007f35a2ad69ee in virNetServerRun (srv=0x7f35a53b09d0) at rpc/virnetserver.c:1097
#17 0x00007f35a34e5b6b in main (argc=2, argv=0x7fffe188e778) at libvirtd.c:1512

(gdb) up
#1  0x00007f35a047d16c in _L_lock_516 () from /lib64/libpthread.so.0
(gdb) up
#2  0x00007f35a047cfbb in pthread_mutex_lock () from /lib64/libpthread.so.0
(gdb)
#3  0x00007f35a29ab83f in virMutexLock (m=0x7f3588024e80) at util/virthreadpthread.c:85
85          pthread_mutex_lock(&m->lock);
(gdb)
#4  0x00007f35a2994d62 in virObjectLock (anyobj=0x7f3588024e70) at util/virobject.c:320
320         virMutexLock(&obj->lock);
(gdb)
#5  0x00007f358ed5bbd7 in virLXCProcessMonitorInitNotify (mon=0x7f3560000ab0, initpid=29062, vm=0x7f3588024e70) at lxc/lxc_process.c:601
601         virObjectLock(vm);
(gdb) p *mon
$1 = {parent = {parent = {magic = 3405643812, refs = 2, klass = 0x7f3588102cb0},
    lock = {lock = {__data = {__lock = 0, __count = 0, __owner = 0, __nusers = 0,
          __kind = 0, __spins = 0, __list = {__prev = 0x0, __next = 0x0}},
        __size = '\000' <repeats 39 times>, __align = 0}}}, vm = 0x7f3588024e70,
  cb = {destroy = 0x0, eofNotify = 0x0,
    exitNotify = 0x7f358ed5b928 <virLXCProcessMonitorExitNotify>,
    initNotify = 0x7f358ed5bb8b <virLXCProcessMonitorInitNotify>},
  client = 0x7f3560001fd0, program = 0x7f35600087b0}
(gdb)

Michal
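
For illustration only, the lock-order inversion discussed in this thread can be reproduced outside libvirt with plain pthreads. This is a minimal sketch, not libvirt code: the struct and every function name below are hypothetical, and the two mutexes merely stand in for @mon->client and @vm. It uses pthread_mutex_trylock() and two barriers so that the inverted case reports EBUSY on both sides instead of hanging. The second pair of threads assumes the approach from the quoted patch description, where the destroy path drops @vm before it touches the client lock.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Hypothetical stand-ins; none of these names exist in libvirt. */
struct demo {
    pthread_mutex_t client;   /* plays the role of @mon->client */
    pthread_mutex_t vm;       /* plays the role of @vm */
    pthread_barrier_t hold;   /* each thread holds its first lock */
    pthread_barrier_t tried;  /* each thread has tried its second lock */
    int event_rc;             /* trylock(vm) result, event-loop side */
    int destroy_rc;           /* trylock(client) result, destroy side */
};

/* Event-loop side, inverted case: client first, then vm. */
static void *event_inverted(void *opaque)
{
    struct demo *d = opaque;
    pthread_mutex_lock(&d->client);
    pthread_barrier_wait(&d->hold);
    d->event_rc = pthread_mutex_trylock(&d->vm); /* lock() would block here */
    pthread_barrier_wait(&d->tried);
    if (d->event_rc == 0)
        pthread_mutex_unlock(&d->vm);
    pthread_mutex_unlock(&d->client);
    return NULL;
}

/* Destroy side, inverted case: vm first, then client. */
static void *destroy_inverted(void *opaque)
{
    struct demo *d = opaque;
    pthread_mutex_lock(&d->vm);
    pthread_barrier_wait(&d->hold);
    d->destroy_rc = pthread_mutex_trylock(&d->client);
    pthread_barrier_wait(&d->tried);
    if (d->destroy_rc == 0)
        pthread_mutex_unlock(&d->client);
    pthread_mutex_unlock(&d->vm);
    return NULL;
}

/* Event-loop side with blocking locks, same order as before. */
static void *event_blocking(void *opaque)
{
    struct demo *d = opaque;
    pthread_mutex_lock(&d->client);
    pthread_mutex_lock(&d->vm);
    pthread_mutex_unlock(&d->vm);
    pthread_mutex_unlock(&d->client);
    return NULL;
}

/* Destroy side that releases vm before it takes client. */
static void *destroy_release_first(void *opaque)
{
    struct demo *d = opaque;
    pthread_mutex_lock(&d->vm);
    /* ... work that needs vm would go here ... */
    pthread_mutex_unlock(&d->vm);
    pthread_mutex_lock(&d->client);
    pthread_mutex_unlock(&d->client);
    return NULL;
}

/* Run one event-side and one destroy-side body on fresh locks. */
static void run_pair(struct demo *d, void *(*ev)(void *), void *(*ds)(void *))
{
    pthread_t a, b;
    int rc;
    pthread_mutex_init(&d->client, NULL);
    pthread_mutex_init(&d->vm, NULL);
    pthread_barrier_init(&d->hold, NULL, 2);
    pthread_barrier_init(&d->tried, NULL, 2);
    d->event_rc = d->destroy_rc = -1;
    rc = pthread_create(&a, NULL, ev, d);
    assert(rc == 0);
    rc = pthread_create(&b, NULL, ds, d);
    assert(rc == 0);
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    pthread_barrier_destroy(&d->hold);
    pthread_barrier_destroy(&d->tried);
    pthread_mutex_destroy(&d->client);
    pthread_mutex_destroy(&d->vm);
}

/* Reports what each side saw when it tried its second lock. */
int demo_inverted_order(int *event_rc, int *destroy_rc)
{
    struct demo d;
    run_pair(&d, event_inverted, destroy_inverted);
    *event_rc = d.event_rc;
    *destroy_rc = d.destroy_rc;
    return 0;
}

/* Returns 1 once both threads have finished with blocking locks. */
int demo_release_before_close(void)
{
    struct demo d;
    run_pair(&d, event_blocking, destroy_release_first);
    return 1;
}
```

Calling demo_inverted_order(&ev, &ds) from a small main() should leave both results at EBUSY: each side holds the lock the other one wants, which with blocking locks is the hang seen in the backtrace. demo_release_before_close() returns because the destroy side never waits for the client lock while it still holds vm. Build with -pthread.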