Re: [PATCH 17/20] include: Define constants for parallel save/restore

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 10/14/24 21:41, Jim Fehlig via Devel wrote:
> On 10/14/24 11:42, Daniel P. Berrangé wrote:
>> On Mon, Oct 14, 2024 at 06:00:53PM +0200, Claudio Fontana wrote:
>>> On 10/10/24 15:29, Daniel P. Berrangé wrote:
>>>> On Thu, Aug 08, 2024 at 05:38:10PM -0600, Jim Fehlig via Devel wrote:
>>>>> From: Claudio Fontana <cfontana@xxxxxxx>
>>>>>
>>>>> Add a new VIR_DOMAIN_SAVE_PARALLEL flag to the save and restore
>>>>> APIs, which can be used to specify the use of multiple, parallel
>>>>> channels for saving a domain. The number of parallel channels
>>>>> can be set using the VIR_DOMAIN_SAVE_PARAM_PARALLEL_CONNECTIONS
>>>>> typed parameter.
>>>>>
>>>>> Signed-off-by: Claudio Fontana <cfontana@xxxxxxx>
>>>>> Signed-off-by: Jim Fehlig <jfehlig@xxxxxxxx>
>>>>> ---
>>>>>   include/libvirt/libvirt-domain.h | 13 +++++++++++++
>>>>>   src/libvirt-domain.c             |  6 ++++++
>>>>>   2 files changed, 19 insertions(+)
>>>>>
>>>>> diff --git a/include/libvirt/libvirt-domain.h b/include/libvirt/libvirt-domain.h
>>>>> index 4266237abe..f4bf9c3cdb 100644
>>>>> --- a/include/libvirt/libvirt-domain.h
>>>>> +++ b/include/libvirt/libvirt-domain.h
>>>>> @@ -1596,6 +1596,7 @@ typedef enum {
>>>>>       VIR_DOMAIN_SAVE_RUNNING      = 1 << 1, /* Favor running over paused (Since: 0.9.5) */
>>>>>       VIR_DOMAIN_SAVE_PAUSED       = 1 << 2, /* Favor paused over running (Since: 0.9.5) */
>>>>>       VIR_DOMAIN_SAVE_RESET_NVRAM  = 1 << 3, /* Re-initialize NVRAM from template (Since: 8.1.0) */
>>>>> +    VIR_DOMAIN_SAVE_PARALLEL     = 1 << 4, /* Save and restore using parallel channels (Since: 10.6.0) */
>>>>>   } virDomainSaveRestoreFlags;
>>>>>   
>>>>>   int                     virDomainSave           (virDomainPtr domain,
>>>>> @@ -1641,6 +1642,18 @@ int                     virDomainRestoreParams  (virConnectPtr conn,
>>>>>    */
>>>>>   # define VIR_DOMAIN_SAVE_PARAM_DXML             "dxml"
>>>>>   
>>>>> +/**
>>>>> + * VIR_DOMAIN_SAVE_PARAM_PARALLEL_CONNECTIONS:
>>>>> + *
>>>>> + * this optional parameter mirrors the migration parameter
>>>>> + * VIR_MIGRATE_PARAM_PARALLEL_CONNECTIONS.
>>>>> + *
>>>>> + * It specifies the number of extra connections to use when transfering state.
>>>>> + *
>>>>> + * Since: 10.6.0
>>>>> + */
>>>>> +# define VIR_DOMAIN_SAVE_PARAM_PARALLEL_CONNECTIONS     "parallel.connections"
>>>>
>>>> "CONNECTIONS" is perhaps the wrong name, as that's kinda an impl
>>>
>>> I would suggest CONNECTIONS it is the correct name.
>>>
>>>> detail. How about VIR_MIGRATE_PARAM_PARALLEL_CPUS to reflect
>>>> the number of CPUs we're going to burn time on.
>>>
>>> This is another aspect of it, but I would suggest that _this_ is much more an implementation detail (inside qemu) on how the parallel connection is actually managed.
>>> Maybe in the future there will be a pool of cpus serving this, or another optional way to serve the connections.
>>>
>>> Considering the fact that libvirt _already has_ for regular migrations
>>>
>>>      {.typedParam = VIR_MIGRATE_PARAM_PARALLEL_CONNECTIONS,
>>>       .param = QEMU_MIGRATION_PARAM_MULTIFD_CHANNELS,
>>>       .party = QEMU_MIGRATION_SOURCE | QEMU_MIGRATION_DESTINATION},
>>>
>>> which controls exactly the same QEMU parameter (ie the multifd-channels),
>>> using anything else than --parallel-connections is in my opinion very surprising for the users (which would know what to expect from the same command line option),
>>> and honestly simply absurd.
>>
>> This is comparing two different scenarios which work in varying ways.
>>
>> In the case of live migration there are network connections that

this is a fair point

>> we're controlling. So calling the parameter "parallel connections"
>> is contextually relevant, though we could equally have called
>> it "parallel vcpus" - depends whether you're thinking about multifd
>> as being there to max out TCP connections, or to max our local CPU
>> usage.
>>
>> In the case of migrating to a file though, there is no concept
>> of network connections at all. QEMU is opening a file and directly
>> writing to that file from multiple threads.  Calling the parameter
>> 'parallel connections' makes no sense in this context, as there
>> are no connections in use.  "parallel CPUs" reflects that we're
>> consuming multiple CPUs to write out data, though we could also
>> call it "parallel threads" for similar reasons.

here I disagree, because we don't know what QEMU will do to serve those multifd channels (threads, coroutines, or whatever construct might be implemented in the future to improve efficiency).

> 
> I can change this to PARALLEL_CPUS if really desired, but there's something 
> about it that doesn't feel right. It's hard to put to words, probably because 
> the reasons are mostly subjective. Claudio already raised the objective ones.
> 
> I do prefer PARALLEL_THREADS over PARALLEL_CPUS. How about reusing the name 
> already selected by QEMU and going with PARALLEL_CHANNELS?

Good idea, PARALLEL_CHANNELS seems to be the best to me.

> 
> Regards,
> Jim

Claudio




[Index of Archives]     [Virt Tools]     [Libvirt Users]     [Lib OS Info]     [Fedora Users]     [Fedora Desktop]     [Fedora SELinux]     [Big List of Linux Books]     [Yosemite News]     [KDE Users]     [Fedora Tools]

  Powered by Linux