>> > > One theory I had was that the virDomainObjListSearchName method could
>> > > be a bottleneck, because that acquires a lock on every single VM. This
>> > > is invoked when starting a VM, when we call virDomainObjListAddLocked.
>> > > I tried removing this locking though & didn't see any performance
>> > > benefit, so never pursued this further. Before trying things like
>> > > this again, I think we'd need to find a way to actually identify where
>> > > the true bottlenecks are, rather than guesswork.
...
> Oh someone has already written such a systemtap script
>
>   http://sourceware.org/systemtap/examples/process/mutex-contention.stp
>
> I think that is preferable to trying to embed special code in
> libvirt for this task.
>
> Daniel
> --
> |: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
> |: http://libvirt.org              -o-             http://virt-manager.org :|
> |: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
> |: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|

Cool! The systemtap approach was very fruitful. BTW, at the time of
writing, the example script has a bug. See
http://sourceware.org/ml/systemtap/2013-q2/msg00169.html for the fix.

So the root cause of my bottleneck is the virSecurityManager lock.
From this root cause a few other bottlenecks emerge. The interesting
parts of the mutex-contention.stp report are pasted at the end of this
email. Here's the summary & my analysis:

When a domain is created (domainCreateWithFlags), the domain object's
lock is held. During the domain creation, various virSecurity functions
are called, which all grab the security manager's lock. Since the
security manager's lock is global, some fraction of
domainCreateWithFlags is serialized by this lock.
Since some virSecurity functions can take a long time (for example,
virSecurityManagerGenLabel takes around 1s with the apparmor security
driver), the serialization that the security manager lock induces in
domainCreateWithFlags is substantial.

Since the domain's object lock is held all of this time,
virDomainObjListSearchName blocks, thereby serializing
virConnectDefineXML via virDomainObjListAdd, as you suggested earlier.
Moreover, since the virDomainObjList lock is held while blocking in
virDomainObjListSearchName, there's measurable contention whilst
looking up domains during domainCreateWithFlags.

Since some security driver operations are costly, I think it's
worthwhile to reduce the scope of the security manager lock or increase
the granularity by introducing more locks. After a cursory look, the
security manager lock seems to have a much broader scope than
necessary. The system / library calls underlying the security drivers
are all thread safe (e.g., defining apparmor security profiles or
chowning disk files), so a global lock isn't strictly necessary.
Moreover, since most virSecurity calls are made whilst a virDomainObj
lock is held and the security calls are generally domain specific,
*most* of the security calls are probably thread safe in the absence of
the global security manager lock. Obviously some work will have to be
done to see where the security lock actually matters, and some
finer-grained locks will have to be introduced to handle these
situations.

I also think it's worthwhile to eliminate locking from the
virDomainObjList lookups and traversals. Since virDomainObjLists are
accessed in a bunch of places, I think it's a good defensive idea to
decouple the performance of these accesses from virDomainObj locks,
which are held during potentially long-running operations like domain
creation.
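To make the lock-scope point above concrete, here is a minimal sketch in C with pthreads. Everything in it is hypothetical: SecManager, sec_manager_init, slow_domain_work, gen_label_broad and gen_label_narrow are illustrative names, not libvirt's real types or functions. It assumes the slow per-domain work is itself thread safe and that only a small piece of shared state (here a counter) really needs the global lock.

```c
#include <assert.h>
#include <pthread.h>
#include <stdio.h>

/* Hypothetical stand-in for a security manager; not libvirt's real type. */
typedef struct {
    pthread_mutex_t lock;  /* global lock shared by every domain */
    unsigned next_id;      /* shared state that genuinely needs the lock */
} SecManager;

static void sec_manager_init(SecManager *m)
{
    pthread_mutex_init(&m->lock, NULL);
    m->next_id = 0;
}

/* Placeholder for slow, per-domain work that is assumed to be thread safe
 * (e.g. generating a profile). Here it only formats a label string. */
static void slow_domain_work(char *buf, size_t len,
                             const char *domain, unsigned id)
{
    snprintf(buf, len, "label-%s-%u", domain, id);
}

/* Broad scope: the slow work runs under the global lock, so concurrent
 * callers are serialized for its whole duration. */
static unsigned gen_label_broad(SecManager *m, const char *domain,
                                char *buf, size_t len)
{
    assert(m && domain && buf);
    pthread_mutex_lock(&m->lock);
    unsigned id = m->next_id++;
    slow_domain_work(buf, len, domain, id);
    pthread_mutex_unlock(&m->lock);
    return id;
}

/* Narrow scope: only the shared counter is touched under the lock; the
 * slow work runs after the lock has been released. */
static unsigned gen_label_narrow(SecManager *m, const char *domain,
                                 char *buf, size_t len)
{
    assert(m && domain && buf);
    pthread_mutex_lock(&m->lock);
    unsigned id = m->next_id++;
    pthread_mutex_unlock(&m->lock);
    slow_domain_work(buf, len, domain, id);
    return id;
}
```

Both variants produce the same result for a single caller. The difference only shows up under concurrency, where the narrow variant holds the global lock for a few instructions instead of for the whole slow call. Whether the real security drivers can be split this way is exactly the work mentioned above.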
An easy way to divorce virDomainObjListSearchName from the virDomainObj
lock would be to keep a copy of the domain names in the virDomainObjList
and protect that list with the virDomainObjList lock.

What do you think?

Peter

==============
stack contended 4 times, 261325 avg usec, 576521 max usec, 1045301 total usec, at
__lll_lock_wait+0x1c [libpthread-2.15.so]
_L_lock_858+0xf [libpthread-2.15.so]
__pthread_mutex_lock+0x3a [libpthread-2.15.so]
virDomainObjListFindByUUID+0x21 [libvirt.so.0.1000.4]
qemuDomainGetXMLDesc+0x48 [libvirt_driver_qemu.so]
virDomainGetXMLDesc+0xf5 [libvirt.so.0.1000.4]
remoteDispatchDomainGetXMLDescHelper+0xb6 [libvirtd]
virNetServerProgramDispatch+0x498 [libvirt.so.0.1000.4]
virNetServerProcessMsg+0x2a [libvirt.so.0.1000.4]
virNetServerHandleJob+0x73 [libvirt.so.0.1000.4]
virThreadPoolWorker+0x10e
==============
stack contended 12 times, 128053 avg usec, 992567 max usec, 1536640 total usec, at
__lll_lock_wait+0x1c [libpthread-2.15.so]
_L_lock_858+0xf [libpthread-2.15.so]
__pthread_mutex_lock+0x3a [libpthread-2.15.so]
virDomainObjListFindByUUID+0x21 [libvirt.so.0.1000.4]
qemuDomainStartWithFlags+0x5a [libvirt_driver_qemu.so]
virDomainCreateWithFlags+0xf5 [libvirt.so.0.1000.4]
remoteDispatchDomainCreateWithFlagsHelper+0xbe [libvirtd]
virNetServerProgramDispatch+0x498 [libvirt.so.0.1000.4]
virNetServerProcessMsg+0x2a [libvirt.so.0.1000.4]
virNetServerHandleJob+0x73 [libvirt.so.0.1000.4]
virThreadPo
==============
stack contended 24 times, 289502 avg usec, 3441079 max usec, 6948070 total usec, at
__lll_lock_wait+0x1c [libpthread-2.15.so]
_L_lock_858+0xf [libpthread-2.15.so]
__pthread_mutex_lock+0x3a [libpthread-2.15.so]
virDomainObjListSearchName+0x19 [libvirt.so.0.1000.4]
virHashSearch+0x65 [libvirt.so.0.1000.4]
virDomainObjListAddLocked.isra.37+0x1f4 [libvirt.so.0.1000.4]
virDomainObjListAdd+0x45 [libvirt.so.0.1000.4]
qemuDomainDefine+0xfa [libvirt_driver_qemu.so]
virDomainDefineXML+0x8b [libvirt.so.0.1000.4]
remoteDispatchDomainDefineXMLHelper+0x84 [libvirtd]
virNetServerProgramDispatch+0x498 [l
==============
stack contended 15 times, 3454756 avg usec, 5041310 max usec, 51821341 total usec, at
__lll_lock_wait+0x1c [libpthread-2.15.so]
_L_lock_858+0xf [libpthread-2.15.so]
__pthread_mutex_lock+0x3a [libpthread-2.15.so]
virDomainObjListAdd+0x2e [libvirt.so.0.1000.4]
qemuDomainDefine+0xfa [libvirt_driver_qemu.so]
virDomainDefineXML+0x8b [libvirt.so.0.1000.4]
remoteDispatchDomainDefineXMLHelper+0x84 [libvirtd]
virNetServerProgramDispatch+0x498 [libvirt.so.0.1000.4]
virNetServerProcessMsg+0x2a [libvirt.so.0.1000.4]
virNetServerHandleJob+0x73 [libvirt.so.0.1000.4]
virThreadPoolWorker+0x10e [libvirt.so.
==============
stack contended 6 times, 3291567 avg usec, 7098148 max usec, 19749405 total usec, at
__lll_lock_wait+0x1c [libpthread-2.15.so]
_L_lock_858+0xf [libpthread-2.15.so]
__pthread_mutex_lock+0x3a [libpthread-2.15.so]
qemuProcessStart+0x1014 [libvirt_driver_qemu.so]
qemuDomainObjStart+0x24e [libvirt_driver_qemu.so]
qemuDomainStartWithFlags+0x152 [libvirt_driver_qemu.so]
virDomainCreateWithFlags+0xf5 [libvirt.so.0.1000.4]
remoteDispatchDomainCreateWithFlagsHelper+0xbe [libvirtd]
virNetServerProgramDispatch+0x498 [libvirt.so.0.1000.4]
virNetServerProcessMsg+0x2a [libvirt.so.0.1000.4]
virNetServerHa
==============
stack contended 16 times, 3708683 avg usec, 7936168 max usec, 59338928 total usec, at
__lll_lock_wait+0x1c [libpthread-2.15.so]
_L_lock_858+0xf [libpthread-2.15.so]
__pthread_mutex_lock+0x3a [libpthread-2.15.so]
virSecurityManagerGenLabel+0x50 [libvirt.so.0.1000.4]
qemuProcessStart+0x297 [libvirt_driver_qemu.so]
qemuDomainObjStart+0x24e [libvirt_driver_qemu.so]
qemuDomainStartWithFlags+0x152 [libvirt_driver_qemu.so]
virDomainCreateWithFlags+0xf5 [libvirt.so.0.1000.4]
remoteDispatchDomainCreateWithFlagsHelper+0xbe [libvirtd]
virNetServerProgramDispatch+0x498 [libvirt.so.0.1000.4]
virNetServe
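P.S. For the name-copy idea in the body of this mail, here is a rough sketch of what I have in mind. It is an illustration only: DomainList, domain_list_init, domain_list_has_name and domain_list_add_name are hypothetical names, and the fixed-size array is just for brevity (a real version would presumably use a hash keyed by name). The point is that the duplicate-name check takes the list lock alone and never touches a per-domain lock.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <string.h>

#define MAX_DOMAINS 64
#define MAX_NAME    64

/* Hypothetical list type; not libvirt's virDomainObjList. */
typedef struct {
    pthread_mutex_t lock;               /* the list lock */
    char names[MAX_DOMAINS][MAX_NAME];  /* copies of the domain names */
    size_t count;
} DomainList;

static void domain_list_init(DomainList *l)
{
    pthread_mutex_init(&l->lock, NULL);
    l->count = 0;
}

/* Caller must hold l->lock. Scans only the name copies. */
static bool name_in_use_locked(const DomainList *l, const char *name)
{
    for (size_t i = 0; i < l->count; i++)
        if (strcmp(l->names[i], name) == 0)
            return true;
    return false;
}

/* Lookup by name: takes the list lock only, no per-domain lock. */
static bool domain_list_has_name(DomainList *l, const char *name)
{
    assert(l && name);
    pthread_mutex_lock(&l->lock);
    bool found = name_in_use_locked(l, name);
    pthread_mutex_unlock(&l->lock);
    return found;
}

/* Returns 0 on success, -1 if the name is a duplicate, is too long,
 * or the list is full. */
static int domain_list_add_name(DomainList *l, const char *name)
{
    assert(l && name);
    int ret = -1;
    pthread_mutex_lock(&l->lock);
    if (l->count < MAX_DOMAINS && strlen(name) < MAX_NAME &&
        !name_in_use_locked(l, name)) {
        strcpy(l->names[l->count++], name);
        ret = 0;
    }
    pthread_mutex_unlock(&l->lock);
    return ret;
}
```

The cost is that the copies have to be kept in sync whenever a domain is removed or renamed, which the sketch leaves out.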