On 10/14/24 21:41, Jim Fehlig via Devel wrote: > On 10/14/24 11:42, Daniel P. Berrangé wrote: >> On Mon, Oct 14, 2024 at 06:00:53PM +0200, Claudio Fontana wrote: >>> On 10/10/24 15:29, Daniel P. Berrangé wrote: >>>> On Thu, Aug 08, 2024 at 05:38:10PM -0600, Jim Fehlig via Devel wrote: >>>>> From: Claudio Fontana <cfontana@xxxxxxx> >>>>> >>>>> Add a new VIR_DOMAIN_SAVE_PARALLEL flag to the save and restore >>>>> APIs, which can be used to specify the use of multiple, parallel >>>>> channels for saving a domain. The number of parallel channels >>>>> can be set using the VIR_DOMAIN_SAVE_PARAM_PARALLEL_CONNECTIONS >>>>> typed parameter. >>>>> >>>>> Signed-off-by: Claudio Fontana <cfontana@xxxxxxx> >>>>> Signed-off-by: Jim Fehlig <jfehlig@xxxxxxxx> >>>>> --- >>>>> include/libvirt/libvirt-domain.h | 13 +++++++++++++ >>>>> src/libvirt-domain.c | 6 ++++++ >>>>> 2 files changed, 19 insertions(+) >>>>> >>>>> diff --git a/include/libvirt/libvirt-domain.h b/include/libvirt/libvirt-domain.h >>>>> index 4266237abe..f4bf9c3cdb 100644 >>>>> --- a/include/libvirt/libvirt-domain.h >>>>> +++ b/include/libvirt/libvirt-domain.h >>>>> @@ -1596,6 +1596,7 @@ typedef enum { >>>>> VIR_DOMAIN_SAVE_RUNNING = 1 << 1, /* Favor running over paused (Since: 0.9.5) */ >>>>> VIR_DOMAIN_SAVE_PAUSED = 1 << 2, /* Favor paused over running (Since: 0.9.5) */ >>>>> VIR_DOMAIN_SAVE_RESET_NVRAM = 1 << 3, /* Re-initialize NVRAM from template (Since: 8.1.0) */ >>>>> + VIR_DOMAIN_SAVE_PARALLEL = 1 << 4, /* Save and restore using parallel channels (Since: 10.6.0) */ >>>>> } virDomainSaveRestoreFlags; >>>>> >>>>> int virDomainSave (virDomainPtr domain, >>>>> @@ -1641,6 +1642,18 @@ int virDomainRestoreParams (virConnectPtr conn, >>>>> */ >>>>> # define VIR_DOMAIN_SAVE_PARAM_DXML "dxml" >>>>> >>>>> +/** >>>>> + * VIR_DOMAIN_SAVE_PARAM_PARALLEL_CONNECTIONS: >>>>> + * >>>>> + * this optional parameter mirrors the migration parameter >>>>> + * VIR_MIGRATE_PARAM_PARALLEL_CONNECTIONS. >>>>> + * >>>>> + * It specifies the number of extra connections to use when transfering state. >>>>> + * >>>>> + * Since: 10.6.0 >>>>> + */ >>>>> +# define VIR_DOMAIN_SAVE_PARAM_PARALLEL_CONNECTIONS "parallel.connections" >>>> >>>> "CONNECTIONS" is perhaps the wrong name, as that's kinda an impl >>> >>> I would suggest CONNECTIONS it is the correct name. >>> >>>> detail. How about VIR_MIGRATE_PARAM_PARALLEL_CPUS to reflect >>>> the number of CPUs we're going to burn time on. >>> >>> This is another aspect of it, but I would suggest that _this_ is much more an implementation detail (inside qemu) on how the parallel connection is actually managed. >>> Maybe in the future there will be a pool of cpus serving this, or another optional way to serve the connections. >>> >>> Considering the fact that libvirt _already has_ for regular migrations >>> >>> {.typedParam = VIR_MIGRATE_PARAM_PARALLEL_CONNECTIONS, >>> .param = QEMU_MIGRATION_PARAM_MULTIFD_CHANNELS, >>> .party = QEMU_MIGRATION_SOURCE | QEMU_MIGRATION_DESTINATION}, >>> >>> which controls exactly the same QEMU parameter (ie the multifd-channels), >>> using anything else than --parallel-connections is in my opinion very surprising for the users (which would know what to expect from the same command line option), >>> and honestly simply absurd. >> >> This is comparing two different scenarios which work in varying ways. >> >> In the case of live migration there are network connections that this is a fair point >> we're controlling. So calling the parameter "parallel connections" >> is contextually relevant, though we could equally have called >> it "parallel vcpus" - depends whether you're thinking about multifd >> as being there to max out TCP connections, or to max our local CPU >> usage. >> >> In the case of migrating to a file though, there is no concept >> of network connections at all. QEMU is opening a file and directly >> writing to that file from multiple threads. Calling the parameter >> 'parallel connections' makes no sense in this context, as there >> are no connections in use. "parallel CPUs" reflects that we're >> consuming multiple CPUs to write out data, though we could also >> call it "parallel threads" for similar reasons. here I disagree, because we don't know what QEMU will do to serve those multifd channels (threads, coroutines, or whatever construct might be implemented in the future to improve efficiency). > > I can change this to PARALLEL_CPUS if really desired, but there's something > about it that doesn't feel right. It's hard to put to words, probably because > the reasons are mostly subjective. Claudio already raised the objective ones. > > I do prefer PARALLEL_THREADS over PARALLEL_CPUS. How about reusing the name > already selected by QEMU and going with PARALLEL_CHANNELS? Good idea, PARALLEL_CHANNELS seems to be the best to me. > > Regards, > Jim Claudio