On Thu, May 16, 2013 at 12:09:39PM -0400, Peter Feiner wrote:
> Hello Daniel,
>
> I've been working on improving scalability in OpenStack on libvirt+kvm
> for the last couple of months. I'm particularly interested in reducing
> the time it takes to create VMs when many VMs are requested in
> parallel.
>
> One apparent bottleneck during virtual machine creation is libvirt. As
> more VMs are created in parallel, some libvirt calls (i.e.,
> virConnectGetLibVersion and virDomainCreateWithFlags) take longer
> without a commensurate increase in hardware utilization.
>
> Thanks to your patches in libvirt-1.0.3, the situation has improved.
> Some libvirt calls OpenStack makes during VM creation (i.e.,
> virConnectDefineXML) have no measurable slowdown when many VMs are
> created in parallel. In turn, parallel VM creation in OpenStack is
> significantly faster with libvirt-1.0.3. On my standard benchmark
> (create 20 VMs in parallel, wait until the VM is ACTIVE, which is
> essentially after virDomainCreateWithFlags returns), libvirt-1.0.3
> reduces the median creation time from 90s to 60s when compared to
> libvirt-0.9.8.

How many CPU cores are you testing on? That's a good improvement,
but I'd expect the improvement to grow with the number of cores.

Also, did you tune /etc/libvirt/libvirtd.conf at all? By default we
limit a single connection to only 5 RPC calls at a time. Beyond that,
calls queue up, even if libvirtd is otherwise idle. OpenStack uses a
single connection for everything, so it will hit this limit. I suspect
this is why virConnectGetLibVersion appears to be slow. That API does
absolutely nothing of any consequence, so the only reason I'd expect it
to be slow is that you're hitting a libvirtd RPC limit, which causes
the call to be queued.

> I'd like to know if your concurrency work in the qemu driver is
> ongoing. If it isn't, I'd like to pick the work up myself and work on
> further improvements. Any advice or insight would be appreciated.
I'm not actively doing anything in this area, mostly because I don't
have clear data on where any remaining bottlenecks are.

One theory I had was that the virDomainObjListSearchName method could
be a bottleneck, because it acquires a lock on every single VM. This
is invoked when starting a VM, when we call virDomainObjListAddLocked.
I tried removing this locking, though, and didn't see any performance
benefit, so I never pursued it further. Before trying things like this
again, I think we'd need to find a way to actually identify where the
true bottlenecks are, rather than relying on guesswork.

Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|
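
The per-connection RPC limit described in this message can be
illustrated with a small toy model. This is only a sketch under stated
assumptions, not libvirtd's actual dispatch code: it models one
connection with a fixed number of in-flight call slots and a FIFO queue
behind them, and ignores worker-thread limits and everything else the
daemon does. The function name simulate_connection, the 3000 ms cost
assigned to virDomainCreateWithFlags and the 1 ms cost assigned to
virConnectGetLibVersion are hypothetical values chosen for
illustration. The libvirtd.conf setting that controls the real limit
is, to the best of my knowledge, max_client_requests; check the
comments in your own libvirtd.conf before relying on that name.

```python
import heapq


def simulate_connection(calls, max_in_flight=5):
    """Toy FIFO model of one connection with a cap on in-flight calls.

    calls: list of (name, arrival_ms, service_ms), sorted by arrival_ms.
    Returns a list of (name, waited_ms, latency_ms) in the same order.
    All names and durations are illustrative assumptions, not measurements.
    """
    busy = []      # min-heap of finish times of calls currently in flight
    results = []
    for name, arrival, service in calls:
        # Free the slots of calls that finished before this one arrived.
        while busy and busy[0] <= arrival:
            heapq.heappop(busy)
        if len(busy) < max_in_flight:
            start = arrival                # a slot is free: run immediately
        else:
            start = heapq.heappop(busy)    # queue until the next slot frees
        finish = start + service
        heapq.heappush(busy, finish)
        results.append((name, start - arrival, finish - arrival))
    return results


if __name__ == "__main__":
    # 20 slow "create" calls at t=0, then one trivial call at t=100 ms.
    calls = [("virDomainCreateWithFlags", 0, 3000)] * 20
    calls.append(("virConnectGetLibVersion", 100, 1))
    for limit in (5, 32):
        name, waited, latency = simulate_connection(calls, limit)[-1]
        print(f"limit={limit}: {name} waited {waited} ms, latency {latency} ms")
```

In this model the trivial call spends nearly all of its latency waiting
in the queue when the limit is 5, and none when the limit exceeds the
number of outstanding calls. Comparing time spent queued with time
spent executing, per call, is one way to tell an RPC limit apart from
lock contention inside the driver.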