Re: Revisiting parallel save/restore

On 4/26/24 4:04 AM, Daniel P. Berrangé wrote:
On Wed, Apr 17, 2024 at 05:12:27PM -0600, Jim Fehlig via Devel wrote:
A good starting point on this journey is supporting the new mapped-ram
capability in qemu 9.0 [2]. Since mapped-ram is a new on-disk format, I
assume we'll need a new QEMU_SAVE_VERSION 3 when using it? Otherwise I'm not
sure how to detect if a saved image is in mapped-ram format vs the existing,
sequential stream format.

Yes, we'll need to support 'mapped-ram', so that's a good first step.

A question is whether we make that feature mandatory for all save images,
implied by another feature (parallel save), or a directly controllable
opt-in feature.

The first option breaks back compat with existing libvirt, while the latter
two options are net new, so they have no compat implications.

In terms of actual data blocks written on disk, mapped-ram should be the
same size as, or smaller than, the existing format.

In terms of logical file size, however, mapped-ram will almost always be
larger.

This is because mapped-ram will result in a file whose logical size matches
the guest RAM size, plus some header overhead, while being sparse so not
all blocks are written.
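The logical-vs-allocated distinction can be illustrated with a small sketch
(Python, file names and sizes purely illustrative): a file truncated to guest
RAM size but with only one block written reports a large st_size while
consuming very little disk.

```python
import os
import tempfile

def make_sparse_image(path, logical_size, payload=b"\x01" * 4096):
    """Create a sparse file: large logical size, only one block allocated."""
    with open(path, "wb") as f:
        f.truncate(logical_size)   # sets the logical (apparent) size
        f.seek(0)
        f.write(payload)           # only this region is actually allocated

with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "guest.sav")
    make_sparse_image(path, 1 << 30)   # 1 GiB logical size
    st = os.stat(path)
    print("logical bytes:", st.st_size)
    print("allocated bytes:", st.st_blocks * 512)
```

A non-sparse-aware tool (e.g. a naive copy) would expand such a file to its
full logical size, which is the regression risk mentioned above.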

If tools handling save images aren't sparse-aware this could come across
as a surprise and even be considered a regression.

Mapped ram is needed for parallel saves since it lets each thread write
to a specific region of the file.

Mapped ram is good for non-parallel saves too though, because the mapping
of RAM into the file is aligned suitably to allow O_DIRECT to be used.
Currently libvirt has to tunnel through its iohelper to fix up the
alignment needed for O_DIRECT. This makes mapped-ram desirable in general,
but back compat hurts...


Looking at what we did in the past

First time, we stole an element from 'uint32_t unused[..]' in the
save header to add the 'compressed' field, and bumped the
version. This prevented old libvirt from reading the files, which
was needed since adding compression was a non-backwards-compatible
change. We could have carried on using version 1 for non-compressed
images, but we didn't for some reason. It was a hard compat break.

Hmm, libvirt's implementation of compression seems to conflict with mapped-ram. AFAIK, mapped-ram requires a seekable fd. Should the two be mutually exclusive?


Next time, we stole an element from 'uint32_t unused[..]' in the
save header, to add the 'cookie_len' field, but did NOT bump
the version. 'unused' is always all zeroes, so new libvirt could
detect whether the cookie was present by the len being non-zero.
Old libvirt would still load the image, but would be ignoring
the cookie data. This was largely harmless.
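The "steal an unused slot" trick works because old writers zero-fill the
whole 'unused' array, so a non-zero value in a repurposed slot is reliably
detectable. Here is a minimal sketch; the exact field layout and magic
string are assumptions for illustration, not the authoritative libvirt
header definition.

```python
import struct

# Assumed header layout: 16-byte magic, then version, data_len, was_running,
# compressed, and 15 x uint32 'unused' slots, little-endian.
HDR = struct.Struct("<16s4I15I")

def pack_header(version, data_len, was_running, compressed, cookie_len=0):
    unused = [0] * 15
    unused[0] = cookie_len   # stolen slot: zero means "no cookie present"
    return HDR.pack(b"LibvirtQemudSave", version, data_len,
                    was_running, compressed, *unused)

def has_cookie(blob):
    fields = HDR.unpack(blob)
    return fields[5] != 0    # first 'unused' slot, repurposed as cookie_len

old_image_hdr = pack_header(2, 4096, 1, 0)                 # old writer
new_image_hdr = pack_header(2, 4096, 1, 0, cookie_len=512) # new writer
```

Old readers simply see a non-zero 'unused' slot they never look at, which is
why this change did not require a version bump.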

This time mapped-ram is a non-compatible change, so we need to
ensure old libvirt won't try to read the files, which suggests
either a save version bump, or we could abuse the 'compressed'
field to indicate 'mapped-ram' as a form of compression.

If we did a save version bump, we might want to carry on using
v2 for non-mapped-ram images.

IIUC, mapped-ram cannot be used with the existing 'fd:' migration URI and
instead must use 'file:'. Does qemu advertise support for that? I couldn't
find it. If not, 'file:' (available in qemu 8.2) predates mapped-ram, so in
theory we could live without the advertisement.

'mapped-ram' is reported in QMP as a MigrationCapability, so I think we
can probe for it directly.
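A probe could just scan the reply of QMP's query-migrate-capabilities, which
returns a list of {"capability", "state"} pairs. A minimal sketch (the sample
replies below are fabricated for illustration):

```python
def qemu_supports_mapped_ram(qmp_reply):
    """Check a query-migrate-capabilities reply for the mapped-ram cap."""
    caps = qmp_reply.get("return", [])
    return any(c.get("capability") == "mapped-ram" for c in caps)

# Illustrative replies: pre-9.0 QEMU lacks the capability, 9.0+ reports it.
old_qemu = {"return": [{"capability": "xbzrle", "state": False}]}
new_qemu = {"return": [{"capability": "xbzrle", "state": False},
                       {"capability": "mapped-ram", "state": False}]}
```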

Yes, it is exclusively for use with 'file:' protocol. If we want to use
FD passing, then we can still do that with 'file:', by using QEMU's
generic /dev/fdset/NNN approach we have with block devices.


It's also not clear when we want to enable the mapped-ram capability. Should
it always be enabled if supported by the underlying qemu? One motivation for
creating the mapped-ram was to support direct-io of the migration stream in
qemu, in which case it could be tied to VIR_DOMAIN_SAVE_BYPASS_CACHE. E.g.
the mapped-ram capability is enabled when user specifies
VIR_DOMAIN_SAVE_BYPASS_CACHE && user-provided path results in a seekable fd
&& qemu supports mapped-ram?
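That condition could be expressed directly; a seekable-fd check falls out of
lseek(2), which fails with ESPIPE on pipes and sockets. This is a sketch of
the hypothetical policy above, not an existing libvirt function:

```python
import os

def fd_is_seekable(fd):
    """Pipes and sockets raise OSError (ESPIPE) on lseek; files succeed."""
    try:
        os.lseek(fd, 0, os.SEEK_CUR)
        return True
    except OSError:
        return False

def should_enable_mapped_ram(bypass_cache, fd, qemu_has_mapped_ram):
    # Hypothetical policy from this mail: mapped-ram only for
    # VIR_DOMAIN_SAVE_BYPASS_CACHE saves to a seekable destination
    # on a mapped-ram-capable QEMU.
    return bypass_cache and qemu_has_mapped_ram and fd_is_seekable(fd)

# demo fds (illustrative)
pipe_r, pipe_w = os.pipe()
devnull = os.open(os.devnull, os.O_RDONLY)
```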

One option is to be lazy and add a /etc/libvirt/qemu.conf setting for the
save format version, defaulting to the latest, v3. Release note that
admin/host provisioning apps must set it to v2 if back compat is
needed with old libvirt. If we assume new -> old save image loading
is relatively rare, that's probably good enough.
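Such a knob might sit alongside the existing save_image_format setting in
qemu.conf; the setting name below is hypothetical, not a current option:

```
# Hypothetical: format version for images produced by virsh save/managedsave.
# Set to 2 for compatibility with older libvirt; defaults to the latest (3).
save_image_version = 2
```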

IOW, we can

  * Bump save version to 3
  * Use v3 by default

Using mapped-ram by default but not supporting compression would be a regression, right? E.g. 'virsh save vm-name /some/path' would suddenly fail if user's /etc/libvirt/qemu.conf contained 'save_image_format = "lzop"'.

Regards,
Jim

  * Add a SAVE_PARALLEL flag which implies mapped-ram, reject
    if v2
  * Use mapped RAM with BYPASS_CACHE for v3, old approach for v2
  * Steal another unused field to indicate use of mapped-ram,
    or perhaps future proof it by declaring a 'features'
    field. So we don't need to bump version again, just make
    sure that the libvirt loading an image supports all
    set features.
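The 'features' idea from the last bullet amounts to a bitmask check at load
time: reject any image that sets a bit this libvirt doesn't understand, so
future features need no further version bumps. A sketch with hypothetical
bit names:

```python
# Hypothetical feature bits for a v3 save header.
SAVE_FEATURE_MAPPED_RAM = 1 << 0
SUPPORTED_FEATURES = SAVE_FEATURE_MAPPED_RAM

def can_load(header_features):
    """Loadable only if every feature bit set in the header is known."""
    unknown = header_features & ~SUPPORTED_FEATURES
    return unknown == 0
```

An old libvirt (with fewer bits in SUPPORTED_FEATURES) then refuses newer
images cleanly instead of misparsing them.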

Looking ahead, should the mapped-ram capability be required for supporting
the VIR_DOMAIN_SAVE_PARALLEL flag? As I understand, parallel save/restore
was another motivation for creating the mapped-ram feature. It allows
multifd threads to write exclusively to the offsets provided by mapped-ram.
Can multiple multifd threads concurrently write to an fd without mapped-ram?

Yes, mapped-ram should be a pre-requisite.


With regards,
Daniel
_______________________________________________
Devel mailing list -- devel@xxxxxxxxxxxxxxxxx
To unsubscribe send an email to devel-leave@xxxxxxxxxxxxxxxxx



