Re: RFC: Improve performance of macvtap device creation

"Daniel P. Berrange" <berrange@xxxxxxxxxx> · Fri, 30 Oct 2015 21:38:27 +0900

On Fri, Oct 30, 2015 at 11:49:12AM +0100, Michal Privoznik wrote:
> On 29.10.2015 18:48, Laine Stump wrote:
> > On 10/29/2015 12:49 PM, Tony Krowiak wrote:
> >> For a guest domain defined with a large number of macvtap devices, it
> >> takes an exceedingly long time to boot the guest. In a test of a guest
> >> domain configured with 82 macvtap devices, it took over two minutes
> >> for the guest to boot. An strace of the ioctl calls during guest start
> >> up showed the SIOCGIFFLAGS ioctl literally being invoked 3,403 times.
> >> I was able to isolate the source of the ioctl calls to
> >> the virNetDevMacVLanCreateWithVPortProfile function
> >> in virnetdevmacvlan.c. The macvtap interface name is created by
> >> looping over a counter variable, starting with zero, and appending the
> >> counter value to 'macvtap'.
> > 
> > I've wondered ever since the first time I saw that code why it was done
> > that way, and why there had never been any performance complaints.
> > Lacking any complaints, I promptly forgot about it (until the next time
> > I went past the code for some other tangentially related reason.)
> > 
> > Since you're the first to complain, you have the honor of fixing it :-)
> > 
> >> With each iteration, a call is made to virNetDevExists (SIOCGIFFLAGS
> >> ioctl) to determine if a device with that name already exists, until a
> >> unique name is created. In the test case cited above, to create an
> >> interface name for the 82nd macvtap device, the virNetDevExists
> >> function will be called for interface names 'macvtap0' to 'macvtap80'
> >> before it is determined that 'macvtap81' can be used. So if N is the
> >> number of macvtap interfaces defined for a guest, the SIOCGIFFLAGS
> >> ioctl will be invoked (N x N + N)/2 times to find unused macvtap
> >> device names. That's assuming only one guest is being started; who
> >> knows how many times the ioctl may have to be called in an
> >> installation running a large number of guests defined with macvtap
> >> devices.
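
The (N x N + N)/2 figure can be checked with a small model of the linear
scan. This is only a sketch, not the libvirt code: the in_use[] array is a
hypothetical stand-in for the kernel's interface table, and each probe of
it corresponds to one SIOCGIFFLAGS ioctl.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_DEVICES 1024

/* Hypothetical stand-in for the host's interface table:
 * in_use[i] is true when "macvtap<i>" exists. */
static bool in_use[MAX_DEVICES];

/* Mimic the linear scan: probe macvtap0, macvtap1, ... until a free
 * name is found, claim it, and return the number of probes needed.
 * Returns 0 if every name up to MAX_DEVICES is taken. */
static size_t create_with_linear_scan(void)
{
    size_t probes = 0;
    for (size_t i = 0; i < MAX_DEVICES; i++) {
        probes++;
        if (!in_use[i]) {
            in_use[i] = true;
            return probes;
        }
    }
    return 0;
}

/* Total probes needed to create n devices on an initially empty host. */
static size_t total_probes(size_t n)
{
    size_t total = 0;
    assert(n <= MAX_DEVICES);
    for (size_t i = 0; i < MAX_DEVICES; i++)
        in_use[i] = false;
    for (size_t k = 0; k < n; k++)
        total += create_with_linear_scan();
    return total;
}
```

In this model total_probes(82) returns 3403, i.e. (82 x 82 + 82)/2, which
matches the ioctl count seen in the strace.
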
> 
> Not only that, but until c0d162c68c2f19af8d55a435a9e372da33857048
> (contained in v1.2.2~32), if two threads were starting a domain concurrently,
> they even competed with each other in that specific area of the code.
> 
> >>
> >> I was able to reduce the amount of time for starting a guest domain
> >> defined with 82 macvtap devices from over 2 minutes to about 14
> >> seconds by keeping track of the interface name suffixes previously
> >> used. I defined two static bit maps (virBitmap), one each for macvtap
> >> and macvlan device name suffixes. When a macvtap/macvlan device is
> >> created, the index of the next clear bit (virBitmapNextClearBit) is
> >> retrieved to create the name. If an interface with that name does not
> >> exist, the device is created and the bit at the index used to create
> >> the interface name is set (virBitmapSetBit). When a macvtap/macvlan
> >> device is deleted, if the interface name has the pattern 'macvtap%d'
> >> or 'macvlan%d', the suffix is parsed into a bit index and used to
> >> clear (virBitmapClearBit) the bit in the respective bitmap.
> > 
> > This sounds fine, as long as 1) you recreate the bitmap whenever
> > libvirtd is restarted (while scanning through all the interfaces of
> > every domain; there is already code being executed in exactly the right
> > place - look for qemu_process.c:qemuProcessNotifyNets() and add
> > appropriate code inside the loop there), and 2) you retry some number of
> > times if a supposedly unused device name is actually in use (to account
> > for processes other than libvirt using the same naming convention).
> 
> How about re-using the approach we have for virPortAllocator? We
> maintain a bitmap of ports. On acquiring a new port, we try to bind() to
> it. If we succeeded, we set the corresponding bit in the bitmap. Of
> course it may happen that a port in the host is already taken but our
> bitmap does not think so. That's okay. We just leave the corresponding
> bit alone => if we would set it as used, nobody will ever unset it.
> Moreover, we will try the port next time, and it may be free.
> 
> Moreover, the bitmap is not saved anywhere, nor restored on daemon
> restart - this could be changed though.
> 
> So what I am saying is practically the same as Laine, just extending his
> thoughts and giving you an example how to proceed further :)

Yeah, I think maintaining a bitmap of used device indexes or names is
a fine idea for this.
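
To make the discussion concrete, here is a rough, self-contained sketch of
the scheme described above. It is not the libvirt implementation: it uses a
plain array instead of virBitmap, and id_bitmap, reserve_name, release_name,
exists_fn and one_foreign_name are all hypothetical names. The exists
callback stands in for a check such as virNetDevExists. A name found to be
in use by someone else is skipped without setting its bit, as in the
virPortAllocator approach, and the search gives up after max_tries such
collisions.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAX_IDS 1024
#define WORD_BITS (sizeof(unsigned long) * CHAR_BIT)

/* Hypothetical allocator state: bit i set means we handed out "<prefix>i". */
struct id_bitmap {
    unsigned long words[(MAX_IDS + WORD_BITS - 1) / WORD_BITS];
};

static bool id_is_set(const struct id_bitmap *bm, size_t i)
{
    return (bm->words[i / WORD_BITS] >> (i % WORD_BITS)) & 1UL;
}

static void id_set(struct id_bitmap *bm, size_t i)
{
    bm->words[i / WORD_BITS] |= 1UL << (i % WORD_BITS);
}

static void id_clear(struct id_bitmap *bm, size_t i)
{
    bm->words[i / WORD_BITS] &= ~(1UL << (i % WORD_BITS));
}

/* Stand-in for an existence check on the host. */
typedef bool (*exists_fn)(const char *name, void *opaque);

/* Example callback: opaque points at one name owned by another process. */
static bool one_foreign_name(const char *name, void *opaque)
{
    return opaque != NULL && strcmp(name, (const char *)opaque) == 0;
}

/*
 * Pick the lowest index whose bit is clear and whose name is not in use.
 * A name taken by someone else is skipped without setting its bit, so it
 * is probed again on a later call. Gives up after max_tries such names.
 * Returns the index (name written to buf), or -1 on failure.
 */
static int reserve_name(struct id_bitmap *bm, const char *prefix,
                        exists_fn exists, void *opaque, int max_tries,
                        char *buf, size_t buflen)
{
    int tries = 0;
    for (size_t i = 0; i < MAX_IDS && tries < max_tries; i++) {
        if (id_is_set(bm, i))
            continue;              /* already ours: no probe needed */
        int n = snprintf(buf, buflen, "%s%zu", prefix, i);
        if (n < 0 || (size_t)n >= buflen)
            return -1;
        if (exists(buf, opaque)) {
            tries++;               /* foreign device: leave bit clear */
            continue;
        }
        id_set(bm, i);
        return (int)i;
    }
    return -1;
}

/* Clear the bit for "<prefix><digits>"; any other name is ignored. */
static void release_name(struct id_bitmap *bm, const char *prefix,
                         const char *name)
{
    size_t plen = strlen(prefix);
    if (strncmp(name, prefix, plen) != 0 ||
        name[plen] < '0' || name[plen] > '9')
        return;
    char *end;
    unsigned long id = strtoul(name + plen, &end, 10);
    if (*end == '\0' && id < MAX_IDS)
        id_clear(bm, id);
}
```

With a zeroed bitmap and 'macvtap1' owned by another process, two calls to
reserve_name hand out 'macvtap0' and then 'macvtap2', leaving bit 1 clear so
that name is tried again later. A real version would also need locking and
would have to rebuild the bitmap after a libvirtd restart, as Laine noted.
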

Regards,
Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|