On 07/24/14 17:03, Peter Krempa wrote:
> On 07/24/14 16:40, Daniel P. Berrange wrote:
>> On Thu, Jul 24, 2014 at 04:30:43PM +0200, Peter Krempa wrote:
>>> On 07/24/14 16:21, Daniel P. Berrange wrote:
>>>> On Thu, Jul 24, 2014 at 02:20:22PM +0200, Peter Krempa wrote:
>>
>>>> So from that POV, I'd say that when we initially configure the
>>>> NUMA / huge page information for a guest at boot time, we should
>>>> be doing that wrt the 'maxMemory' size, instead of the current
>>>> 'memory' size, i.e. the actual NUMA topology is all set up upfront
>>>> even though the DIMMs are not present for some of this topology.
>>>>
>>>>> "address" determines the address in the guest's memory space where
>>>>> the memory will be mapped. This is optional, and setting it is not
>>>>> recommended for the user (except for special cases).
>>>>>
>>>>> For expansion the model="pflash" device may be added.
>>>>>
>>>>> For migration the target VM needs to be started with the
>>>>> hotplugged modules already specified on the command line, which is
>>>>> in line with how we treat devices currently.
>>>>>
>>>>> My suggestion above contrasts with the approach Michal and Martin
>>>>> took when adding the numa and hugepage backing capabilities, as
>>>>> they describe a node while this describes the memory device
>>>>> beneath it. I think those two approaches can co-exist while being
>>>>> mutually exclusive: when using memory hotplug, the memory will
>>>>> need to be specified using the memory modules; non-hotplug guests
>>>>> could use the approach defined originally.
>>>>
>>>> I don't think it is viable to have two different approaches for
>>>> configuring NUMA / huge page information. Apps should not have to
>>>> change the way they configure NUMA/hugepages when they decide they
>>>> want to take advantage of DIMM hotplug.
>>>
>>> Well, the two approaches are orthogonal in the information they
>>> store. The existing approach stores the memory topology from the
>>> point of view of the NUMA node, whereas the <device> based approach
>>> stores it from the point of view of the memory module.
>>
>> Sure, they are clearly designed from different POVs, but I'm saying
>> that from an application POV it is very unpleasant to have 2
>> different ways to configure the same concept in the XML. So I really
>> don't want us to go down that route unless there is absolutely no
>> other option to achieve an acceptable level of functionality. If that
>> really were the case, then I would strongly consider reverting
>> everything related to NUMA that we have just done during this dev
>> cycle and not releasing it as is.
>>
>>> The difference is that the existing approach currently wouldn't
>>> allow splitting a NUMA node into more memory devices to allow
>>> plugging/unplugging them.
>>
>> There's no reason why we have to assume 1 memory slot per guest or
>> per node when booting the guest. If the user wants the ability to
>> unplug, they could set their XML config so the guest has arbitrary
>> slot granularity. E.g. if I have a guest with
>>
>>  - memory == 8 GB
>>  - max-memory == 16 GB
>>  - NUMA nodes == 4
>>
>> then we could allow them to specify 32 memory slots each 512 MB
>> in size. This would allow them to plug/unplug memory from NUMA
>> nodes in 512 MB granularity.

In real hardware you can still plug in modules of different sizes
(e.g. 1 GiB + 2 GiB).

...

> Well, while this makes it pretty close to real hardware, the emulated
> one doesn't have a problem with plugging "dimms" of weird
> (non-power-of-2) sizing. And we are losing flexibility due to that.
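To make the module-level view a bit more concrete, here is a rough
strawman of what one hot-pluggable module could look like as a device
element. All element and attribute names below are purely illustrative
(not an agreed schema); the address is shown only to illustrate the
"special cases" mentioned above:

  <devices>
    ...
    <!-- one hot-pluggable memory module, mapped into guest NUMA node 1 -->
    <memory model='dimm'>
      <target>
        <size unit='MiB'>512</size>
        <node>1</node>
      </target>
      <!-- optional; normally left out so the address is picked automatically -->
      <address type='dimm' slot='3'/>
    </memory>
    ...
  </devices>

With something along those lines, plugging or unplugging memory becomes
attaching or detaching one more device, in line with how we treat other
devices currently.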
> Hmm, now that the rest of the hugepage stuff has been pushed and the
> release is rather soon, what approach should I take? I'd rather avoid
> crippling the interface for memory hotplug and having to add separate
> APIs and other stuff, and mostly I'd like to avoid having to re-do it
> after consumers of libvirt deem it to be inflexible.
>
> Peter
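For comparison, Dan's 8 GiB / 16 GiB / 4 node example would look
roughly like this with the existing NUMA config plus some way to
express the maximum size and the slot count (the <maxMemory> element
and its "slots" attribute are only placeholder names here, not existing
syntax):

  <memory unit='GiB'>8</memory>
  <!-- hypothetical: 16 GiB ceiling, exposed as 32 slots of 512 MiB -->
  <maxMemory unit='GiB' slots='32'>16</maxMemory>
  <cpu>
    <numa>
      <!-- 4 nodes, 2 GiB (2097152 KiB) each at boot -->
      <cell cpus='0-1' memory='2097152'/>
      <cell cpus='2-3' memory='2097152'/>
      <cell cpus='4-5' memory='2097152'/>
      <cell cpus='6-7' memory='2097152'/>
    </numa>
  </cpu>

The obvious downside, as noted above, is that the plug/unplug
granularity is then fixed by the slot size chosen up front.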