This series is essentially V1 of a prior RFC [1] to support QEMU's
mapped-ram stream format [2] and migration capability. Along with
supporting mapped-ram, it implements a design approach we discussed for
supporting parallel save/restore [3]. In summary, the approach is

1. Add mapped-ram migration capability
2. Steal an element from save header 'unused' for a 'features' variable
   and bump save version to 3
3. Add /etc/libvirt/qemu.conf knob for the save format version,
   defaulting to latest v3
4. Use v3 (aka mapped-ram) by default
5. Use mapped-ram with BYPASS_CACHE for v3, old approach for v2
6. include: Define constants for parallel save/restore
7. qemu: Add support for parallel save. Implies mapped-ram, reject if v2
8. qemu: Add support for parallel restore. Implies mapped-ram, reject if v2
9. tools: add parallel parameter to virsh save command
10. tools: add parallel parameter to virsh restore command

With this series, saving and restoring using mapped-ram is enabled by
default if the underlying QEMU advertises the mapped-ram migration
capability. It can be disabled by changing the 'save_image_version'
setting in qemu.conf.

To use mapped-ram with QEMU:
- The 'mapped-ram' migration capability must be set to true
- The 'multifd' migration capability must be set to true and the
  'multifd-channels' migration parameter must be set to a value >= 1
- QEMU must be provided an fdset containing the migration fd(s)
- The 'migrate' QMP command is invoked with a URI referencing the
  fdset and an offset at which to start reading or writing the data
  stream, e.g.

  {"execute":"migrate",
   "arguments":{"detach":true,"resume":false,
                "uri":"file:/dev/fdset/0,offset=0x11921"}}

The mapped-ram stream, in conjunction with direct IO and multifd, can
significantly improve the time required to save VM memory state. The
following tables compare mapped-ram with the existing, sequential save
stream.
In all cases, the save and restore operations are to/from a block
device comprised of two NVMe disks in RAID0 configuration with xfs
(~8600MiB/s). The values in the 'save time' and 'restore time' columns
were scraped from the 'real' time reported by time(1). The 'Size' and
'Blocks' columns were provided by the corresponding outputs of stat(1).

VM: 32G RAM, 1 vcpu, idle (shortly after boot)

                       | save    | restore |              |
                       | time    | time    | Size         | Blocks
-----------------------+---------+---------+--------------+--------
legacy                 | 6.193s  | 4.399s  | 985744812    | 1925288
-----------------------+---------+---------+--------------+--------
mapped-ram             | 5.109s  | 1.176s  | 34368554354  | 1774472
-----------------------+---------+---------+--------------+--------
legacy + direct IO     | 5.725s  | 4.512s  | 985765251    | 1925328
-----------------------+---------+---------+--------------+--------
mapped-ram + direct IO | 4.627s  | 1.490s  | 34368554354  | 1774304
-----------------------+---------+---------+--------------+--------
mapped-ram + direct IO |         |         |              |
 + multifd-channels=8  | 4.421s  | 0.845s  | 34368554318  | 1774312
-------------------------------------------------------------------

VM: 32G RAM, 30G dirty, 1 vcpu in tight loop dirtying memory

                       | save    | restore |              |
                       | time    | time    | Size         | Blocks
-----------------------+---------+---------+--------------+---------
legacy                 | 25.800s | 14.332s | 33154309983  | 64754512
-----------------------+---------+---------+--------------+---------
mapped-ram             | 18.742s | 15.027s | 34368559228  | 64617160
-----------------------+---------+---------+--------------+---------
legacy + direct IO     | 13.115s | 18.050s | 33154310496  | 64754520
-----------------------+---------+---------+--------------+---------
mapped-ram + direct IO | 13.623s | 15.959s | 34368557392  | 64662040
-----------------------+---------+---------+--------------+---------
mapped-ram + direct IO |         |         |              |
 + multifd-channels=8  | 6.994s  | 6.470s  | 34368554980  | 64665776
--------------------------------------------------------------------

As can be seen from the tables, one caveat of mapped-ram is that the
logical file size of a saved image is roughly equal to the VM memory
size, regardless of how much memory is in use. Note however that
mapped-ram typically uses fewer blocks on disk.

Support for mapped-ram+direct-io only recently landed in upstream QEMU
and will first appear in the 9.1 release, which may complicate merging
support in libvirt. Specifically, I'm not sure how to detect whether
the combination is supported by QEMU. Suggestions welcome.

Similar to the RFC, V1 ignores compression. libvirt currently supports
compression by connecting the output of QEMU's save stream to the
specified compression program via a pipe. This approach is incompatible
with mapped-ram since the fd provided to QEMU must be seekable. In
general, we can consider mapped-ram and compression incompatible and
document that they cannot be used together.

[1] https://lists.libvirt.org/archives/list/devel@xxxxxxxxxxxxxxxxx/message/EF6YS5YIPYF2JXFMSKP6OLEJ2XWXJ3XW/
[2] https://gitlab.com/qemu-project/qemu/-/blob/master/docs/devel/migration/mapped-ram.rst?ref_type=heads
[3] https://lists.libvirt.org/archives/list/devel@xxxxxxxxxxxxxxxxx/message/K4BDDJDMJ22XMJEFAUE323H5S5E47VQX/

Claudio Fontana (2):
  include: Define constants for parallel save/restore
  tools: add parallel parameter to virsh restore command

Jim Fehlig (17):
  lib: virDomainSaveParams: Ensure absolute save path
  qemu_fd: Add function to retrieve fdset ID
  qemu: Add function to check capability in migration params
  qemu: Add function to get bool value from migration params
  qemu: Add mapped-ram migration capability
  qemu: Add function to get migration params for save
  qemu: QEMU_SAVE_VERSION: Bump to version 3
  qemu: conf: Add setting for save image version
  qemu: Add helper function for creating save image fd
  qemu: Add support for mapped-ram on save
  qemu: Decompose qemuSaveImageOpen
  qemu: Move creation of qemuProcessIncomingDef struct
  qemu: Apply migration parameters in qemuMigrationDstRun
  qemu: Add support for mapped-ram on restore
  qemu: Support O_DIRECT with mapped-ram on save
  qemu: Support O_DIRECT with mapped-ram on restore
  qemu: Add support for parallel save and restore

Li Zhang (1):
  tools: add parallel parameter to virsh save command

 docs/manpages/virsh.rst            |   9 +-
 include/libvirt/libvirt-domain.h   |  13 ++
 src/libvirt-domain.c               |  52 +++++--
 src/qemu/libvirtd_qemu.aug         |   1 +
 src/qemu/qemu.conf.in              |   6 +
 src/qemu/qemu_conf.c               |  16 +++
 src/qemu/qemu_conf.h               |   5 +
 src/qemu/qemu_driver.c             | 104 +++++++++-----
 src/qemu/qemu_fd.c                 |  18 +++
 src/qemu/qemu_fd.h                 |   3 +
 src/qemu/qemu_migration.c          | 192 +++++++++++++++++--------
 src/qemu/qemu_migration.h          |   9 +-
 src/qemu/qemu_migration_params.c   |  86 ++++++++++++
 src/qemu/qemu_migration_params.h   |  17 +++
 src/qemu/qemu_monitor.c            |  39 ++++++
 src/qemu/qemu_monitor.h            |   5 +
 src/qemu/qemu_process.c            | 120 +++++++++++-----
 src/qemu/qemu_process.h            |  19 ++-
 src/qemu/qemu_saveimage.c          | 216 ++++++++++++++++++++---------
 src/qemu/qemu_saveimage.h          |  35 +++--
 src/qemu/qemu_snapshot.c           |  26 ++--
 src/qemu/test_libvirtd_qemu.aug.in |   1 +
 tools/virsh-domain.c               |  79 +++++++++--
 23 files changed, 827 insertions(+), 244 deletions(-)

-- 
2.35.3
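
As a rough cross-check of the size vs. blocks observation above, the
following Python sketch converts the 'Blocks' values from the idle-VM
table into allocated bytes and compares them with the logical 'Size'.
It assumes stat(1) counts 512-byte blocks, which is the usual default
but is not stated in the text above. The names IDLE_VM, BLOCK_UNIT,
allocated_bytes and fill_ratio are made up for illustration and do not
appear in the series.

```python
# Size (bytes) and Blocks values copied from the idle-VM table above.
IDLE_VM = {
    "legacy": (985744812, 1925288),
    "mapped-ram": (34368554354, 1774472),
}

# Assumption: stat(1) reports blocks in 512-byte units.
BLOCK_UNIT = 512


def allocated_bytes(blocks, unit=BLOCK_UNIT):
    """Bytes actually allocated on disk for a given block count."""
    return blocks * unit


def fill_ratio(size, blocks, unit=BLOCK_UNIT):
    """Allocated bytes divided by logical file size."""
    return allocated_bytes(blocks, unit) / size


def report(images=IDLE_VM):
    """Print logical size, allocated bytes and their ratio per image."""
    for name, (size, blocks) in images.items():
        print(f"{name:12s} size={size} allocated={allocated_bytes(blocks)} "
              f"ratio={fill_ratio(size, blocks):.4f}")


if __name__ == "__main__":
    report()
```

Under that assumption, the idle mapped-ram image allocates about 0.9 GB
on disk, or roughly 2.6% of its ~34 GB logical size, while the legacy
image's allocation matches its logical size almost exactly. In other
words, the mapped-ram file is sparse.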