Re: [libvirt RFCv8 00/27] multifd save restore prototype

On Wed, May 11, 2022 at 09:26:10AM +0200, Claudio Fontana wrote:
> Hi Daniel,
> 
> thanks for looking at this,
> 
> On 5/10/22 8:38 PM, Daniel P. Berrangé wrote:
> > On Sat, May 07, 2022 at 03:42:53PM +0200, Claudio Fontana wrote:
> >> This is v8 of the multifd save prototype, which fixes a few bugs,
> >> adds a few more code splits, and records the number of channels
> >> as well as the compression algorithm, so the restore command is
> >> more user-friendly.
> >>
> >> It is now possible to just say:
> >>
> >> virsh save mydomain /mnt/saves/mysave --parallel
> >>
> >> virsh restore /mnt/saves/mysave --parallel
> >>
> >> and things work with the default of 2 channels, no compression.
> >>
> >> It is also possible to say of course:
> >>
> >> virsh save mydomain /mnt/saves/mysave --parallel
> >>       --parallel-connections 16 --parallel-compression zstd
> >>
> >> virsh restore /mnt/saves/mysave --parallel
> >>
> >> and things also work fine, due to channels and compression
> >> being stored in the main save file.
> > 
> > For the sake of people following along, the above commands will
> > result in the creation of multiple files:
> > 
> >   /mnt/saves/mysave
> >   /mnt/saves/mysave.0
> 
> just a minor correction: there is no .0

Heh, off-by-1

> 
> >   /mnt/saves/mysave.1
> >   ....
> >   /mnt/saves/mysave.n
> > 
> > Where 'n' is the number of threads used.
> > 
> > Overall I'm not very happy with the approach of doing any of this
> > on the libvirt side.
> 
> 
> Ok I understand your concern.
> 
> > 
> > Backing up, we know that QEMU can directly save to disk faster than
> > libvirt can. We mitigated a lot of that overhead with previous patches
> > to increase the pipe buffer size, but some still remains due to the
> > extra copies inherent in handing this off to libvirt.
> 
> Right;
> still, the performance we get is insufficient for the use case we are
> trying to address, even without libvirt in the picture.
> 
> Instead, with parallel save + compression we can make the numbers add up.
> For parallel save using multifd, the overhead of libvirt is negligible.
> 
> > 
> > Using multifd on the libvirt side, IIUC, gets us better performance
> > than QEMU can manage if doing non-multifd write to file directly,
> > but we still have the extra copies in there due to the hand off
> > to libvirt. If QEMU were to be directly capable to writing to
> > disk with multifd, it should beat us again.
> 
> Hmm, I am thinking about this point, and at first glance I don't
> think it is 100% accurate:
> 
> if we do a parallel save with multifd as in this series,
> the overhead of libvirt is almost non-existent in my view
> compared with doing it with qemu only, skipping libvirt.
> It is limited to the one iohelper for the main channel
> (which is the smallest of the transfers),
> and maybe this could be removed as well.

Libvirt adds overhead due to the multiple data copies in
the save process. Using multifd doesn't get rid of this
overhead, it merely distributes the overhead across many
CPUs. The overall wallclock time is reduced, but in aggregate
the CPUs still have the same amount of total work to do
copying data around.
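
To make that copy overhead concrete: any userspace helper that
forwards the migration stream from a socket or pipe into a file,
whether libvirt's iohelper or a hand-rolled script, boils down to a
loop like the sketch below. This is purely illustrative, not libvirt
code; the function name and the 1 MiB chunk size are assumptions.

  /*
   * Purely illustrative: the generic forwarding loop any socket-to-file
   * helper reduces to.  Each chunk of the migration stream is read into
   * a userspace buffer and written back out to the save file; that
   * buffer pass is the copy that multifd spreads across CPUs but never
   * removes.
   */
  #include <stdlib.h>
  #include <unistd.h>

  #define CHUNK_SIZE (1024 * 1024)   /* hypothetical 1 MiB chunk */

  static int forward_stream(int stream_fd, int file_fd)
  {
      char *buf = malloc(CHUNK_SIZE);
      ssize_t nread;

      if (!buf)
          return -1;

      /* copy #1: kernel socket buffer -> userspace buffer */
      while ((nread = read(stream_fd, buf, CHUNK_SIZE)) > 0) {
          ssize_t done = 0;

          /* copy #2: userspace buffer -> page cache of the save file */
          while (done < nread) {
              ssize_t n = write(file_fd, buf + done, nread - done);
              if (n < 0) {
                  free(buf);
                  return -1;
              }
              done += n;
          }
      }

      free(buf);
      return nread < 0 ? -1 : 0;
  }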

I don't recall the scale of the libvirt overhead that remains
after the pipe buffer optimizations, but whatever is left is
still taking up host CPU time that could otherwise be used for
other guests.
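
For reference, the pipe buffer enlargement mentioned above amounts to
an fcntl(F_SETPIPE_SZ) call on the pipe to the iohelper. The sketch
below is illustrative only; the 1 MiB target size is an assumption,
not the value libvirt actually picks.

  /*
   * Illustrative sketch: growing the pipe between libvirtd and its
   * iohelper so each read()/write() moves more data per syscall.
   * F_SETPIPE_SZ needs _GNU_SOURCE; the kernel may round the size up
   * and caps it at /proc/sys/fs/pipe-max-size for unprivileged callers.
   */
  #define _GNU_SOURCE
  #include <fcntl.h>
  #include <stdio.h>

  static int enlarge_pipe(int pipe_fd)
  {
      const int wanted = 1024 * 1024;   /* 1 MiB, hypothetical target */
      int actual = fcntl(pipe_fd, F_SETPIPE_SZ, wanted);

      if (actual < 0) {
          perror("fcntl(F_SETPIPE_SZ)");
          return -1;
      }
      fprintf(stderr, "pipe buffer is now %d bytes\n", actual);
      return 0;
  }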

It also just occurred to me that currently our save/restore
approach bypasses all resource limits applied to the
guest, e.g. block I/O rate limits, CPU affinity controls,
etc., because most of the work is done in the iohelper.
If we had this done in QEMU, then the save/restore process
would be confined by the existing CPU affinity / I/O limits
applied to the guest. This means we would not negatively
impact other co-hosted guests to the same extent.

> This is because even without libvirt in the picture, we
> are still migrating to a socket, and something needs to
> transfer data from that socket to a file. At that point
> I think both libvirt and a custom-made script are in the
> same position.

If QEMU had explicit support for a "file" backend, there
would be no socket involved at all. QEMU would be copying
guest RAM directly to a file with no intermediate steps.
If QEMU mmap'd the save state file, then saving the guest
RAM could possibly even reduce to a mere 'memcpy()'.
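
A rough sketch of that idea, purely illustrative and not a proposed
QEMU interface; the function name, the assumption that 'offset' is
page-aligned, and the error handling are all mine:

  /*
   * Illustrative only: with a real "file" backend QEMU could map the
   * save file and flush a RAM block with a single memcpy(), no socket
   * or intermediate buffer involved.  'offset' is assumed page-aligned.
   */
  #include <fcntl.h>
  #include <string.h>
  #include <sys/mman.h>
  #include <unistd.h>

  static int save_ram_region(const char *path, off_t offset,
                             const void *ram, size_t len)
  {
      void *dst;
      int fd = open(path, O_RDWR | O_CREAT, 0600);

      if (fd < 0)
          return -1;

      /* grow the file so the mapping covers this region */
      if (ftruncate(fd, offset + len) < 0) {
          close(fd);
          return -1;
      }

      dst = mmap(NULL, len, PROT_WRITE, MAP_SHARED, fd, offset);
      if (dst == MAP_FAILED) {
          close(fd);
          return -1;
      }

      memcpy(dst, ram, len);        /* guest RAM -> file mapping */
      msync(dst, len, MS_SYNC);     /* push dirty pages out to disk */

      munmap(dst, len);
      close(fd);
      return 0;
  }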

With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|



