Anthony Liguori wrote:
On 08/23/2010 01:59 PM, Marcelo Tosatti wrote:
On Wed, Aug 11, 2010 at 03:52:18PM +0200, Andre Przywara wrote:
According to the user-provided assignment, bind the respective part
of the guest's memory to the given host node. This uses the Linux
mbind syscall (which is wrapped only in libnuma) to realize the
pinning right after the allocation.
Failures are not fatal, but produce a warning.
Signed-off-by: Andre Przywara<andre.przywara@xxxxxxx>
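For readers without the patch at hand, here is a minimal sketch of the idea described above: allocate a chunk of memory, bind it to one host node right away, and treat a failure as a warning only. It is not the code from the patch. The helper names bind_range_to_node() and alloc_on_node() are hypothetical, the syscall is invoked through syscall(2) instead of the libnuma wrapper, and a one-word node mask is assumed.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Value of MPOL_BIND from <linux/mempolicy.h>, repeated here so the
 * sketch builds without the libnuma headers. */
#ifndef MPOL_BIND
#define MPOL_BIND 2
#endif

/* Hypothetical helper, not taken from the patch: bind [addr, addr+len)
 * to a single host node. Returns 0 on success, -1 after a warning. */
static int bind_range_to_node(void *addr, size_t len, unsigned node)
{
    const unsigned long bits = 8 * sizeof(unsigned long);
    unsigned long mask;

    assert(addr != NULL && len > 0);
    if (node >= bits) {              /* assumption: one-word node mask */
        fprintf(stderr, "warning: node %u out of range\n", node);
        return -1;
    }
    mask = 1UL << node;

    /* bits + 1 is the customary maxnode value for a one-word mask. */
    if (syscall(SYS_mbind, addr, (unsigned long)len, (long)MPOL_BIND,
                &mask, bits + 1, 0UL) != 0) {
        fprintf(stderr, "warning: mbind to node %u failed: %s\n",
                node, strerror(errno));
        return -1;
    }
    return 0;
}

/* Allocate anonymous memory and pin it right after the allocation.
 * A failed binding is not fatal: the memory is returned anyway. */
static void *alloc_on_node(size_t len, unsigned node)
{
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return NULL;
    bind_range_to_node(p, len, node);
    return p;
}
```

Because the policy is set before the pages are touched, they should be faulted in on the requested node; on a kernel without NUMA support the call fails and only the warning is printed.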
>>> ...
Why is it not possible (or perhaps not desired) to change the binding
after the guest is started?
Sounds inflexible.
The solution is to introduce a monitor interface to later adjust the
pinning, allowing both changing the affinity only (only valid for future
fault-ins) and actually copying the memory (more costly).
Actually this is the next item on my list, but I wanted to bring up the
basics first to avoid recoding parts afterwards. Also I am not (yet)
familiar with the QMP protocol.
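To illustrate the two variants mentioned above, the following sketch shows how they could differ at the mbind level: with no flags, only the policy changes and applies to future fault-ins, while MPOL_MF_MOVE additionally asks the kernel to migrate pages that are already present. The function rebind_range() is a hypothetical name, no monitor command is implied, and a one-word node mask is assumed.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Values from <linux/mempolicy.h>, repeated here so the sketch builds
 * without the libnuma headers. */
#ifndef MPOL_BIND
#define MPOL_BIND 2
#endif
#ifndef MPOL_MF_MOVE
#define MPOL_MF_MOVE (1 << 1)
#endif

/* Hypothetical helper: change the binding of [addr, addr+len).
 * migrate == false: affinity only, affects future fault-ins.
 * migrate == true:  also try to move already faulted pages (costly).
 * Returns 0 on success, -1 after printing a warning. */
static int rebind_range(void *addr, size_t len, unsigned node, bool migrate)
{
    const unsigned long bits = 8 * sizeof(unsigned long);
    unsigned long flags = migrate ? (unsigned long)MPOL_MF_MOVE : 0UL;
    unsigned long mask;

    if (node >= bits) {              /* assumption: one-word node mask */
        fprintf(stderr, "warning: node %u out of range\n", node);
        return -1;
    }
    mask = 1UL << node;

    if (syscall(SYS_mbind, addr, (unsigned long)len, (long)MPOL_BIND,
                &mask, bits + 1, flags) != 0) {
        fprintf(stderr, "warning: rebinding to node %u failed: %s\n",
                node, strerror(errno));
        return -1;
    }
    return 0;
}
```

In both cases the contents of the memory are untouched; only the placement of the backing pages may change.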
We really need a solution that lets a user use a tool like numactl
outside of the QEMU instance.
I fear that is not how the Linux NUMA API is meant to work. In
contrast to the VCPU threads, which are externally visible entities
(PIDs), the memory should be private to the QEMU process. While you can
change the NUMA allocation policy of the _whole_ process, there is no
way to externally distinguish parts of the process' memory. Although you
could later (and externally) migrate already faulted pages (via
move_pages(2) and by looking in /proc/$$/numa_maps), you would let an
external tool interfere with QEMU's internal memory management. Take for
instance a change of the allocation policy regarding the 1MB and
3.5-4GB holes. Either an external tool would have to track such changes,
or you simply could not change such things in QEMU. So what is wrong
with keeping that code in QEMU, which knows best about the internals and
already has flexible and powerful ways (command line and QMP) of
manipulating its behavior?
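As a rough illustration of what move_pages(2) works with, the sketch below uses the call in its query mode (nodes == NULL) to ask on which node each page of a range currently resides. It inspects the calling process (pid 0); an external tool would pass QEMU's PID and a list of target nodes instead, and would still have to know which addresses belong to which guest node. The helper name query_page_nodes() and the 64-page limit are assumptions made for this example.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <unistd.h>

#define MAX_QUERY_PAGES 64   /* arbitrary limit for this sketch */

/* Hypothetical helper: fill status[i] with the node holding page i of
 * the range starting at addr (or a negative errno value for that page).
 * Returns the number of pages queried, or -1 after printing a warning. */
static int query_page_nodes(void *addr, size_t npages, int *status)
{
    const size_t page = (size_t)sysconf(_SC_PAGESIZE);
    void *pages[MAX_QUERY_PAGES];
    size_t i;

    if (npages > MAX_QUERY_PAGES)
        npages = MAX_QUERY_PAGES;
    for (i = 0; i < npages; i++)
        pages[i] = (char *)addr + i * page;

    /* pid 0 means the calling process; nodes == NULL means "do not
     * move anything, only report the current node of each page". */
    if (syscall(SYS_move_pages, 0L, (unsigned long)npages, pages,
                (const int *)NULL, status, 0L) != 0) {
        fprintf(stderr, "warning: move_pages query failed: %s\n",
                strerror(errno));
        return -1;
    }
    return (int)npages;
}
```

The per-page addresses are exactly the knowledge an external tool lacks: it would have to reconstruct QEMU's memory layout from numa_maps and keep up with any change to it.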
Regards,
Andre.
--
Andre Przywara
AMD-Operating System Research Center (OSRC), Dresden, Germany
Tel: +49 351 448-3567-12