On 11/21/2011 04:50 PM, Chris Wright wrote:
* Peter Zijlstra (a.p.zijlstra@xxxxxxxxx) wrote:
On Mon, 2011-11-21 at 21:30 +0530, Bharata B Rao wrote:
In the original post of this mail thread, I proposed a way to export
guest RAM ranges (Guest Physical Address-GPA) and their corresponding
host virtual mappings (Host Virtual Address-HVA) from QEMU (via QEMU monitor).
The idea was to use these GPA to HVA mappings from tools like libvirt to bind
specific parts of the guest RAM to different host nodes. This needed an
extension to the existing mbind() to allow binding memory of a process (QEMU)
from a different process (libvirt). This was needed since we wanted to do all
this from
libvirt.
Hence I was coming from that background when I asked for extending
ms_mbind() to take a tid parameter. If QEMU community thinks that NUMA
binding should all be done from outside of QEMU, it is needed, otherwise
what you have should be sufficient.
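
To make the GPA to HVA lookup described above concrete, here is a minimal
sketch in C. The struct ram_range layout and the gpa_to_hva() helper are
hypothetical names made up for illustration; they are not the format QEMU
actually exports, and the sketch only does the address translation, not the
binding itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record for one exported guest RAM range (assumed layout). */
struct ram_range {
    uint64_t gpa_start;  /* first guest physical address in the range */
    uint64_t size;       /* length of the range in bytes */
    uint64_t hva_start;  /* host virtual address backing gpa_start */
};

/*
 * Translate a guest physical address to the host virtual address backing it.
 * Returns 0 and stores the result in *hva on success; returns -1 if no
 * range covers the given GPA.
 */
static int gpa_to_hva(const struct ram_range *ranges, size_t n,
                      uint64_t gpa, uint64_t *hva)
{
    assert(hva != NULL);
    assert(ranges != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        const struct ram_range *r = &ranges[i];

        /* The first test guarantees gpa - gpa_start cannot underflow. */
        if (gpa >= r->gpa_start && gpa - r->gpa_start < r->size) {
            *hva = r->hva_start + (gpa - r->gpa_start);
            return 0;
        }
    }
    return -1;
}
```

Whoever ends up doing the binding would pass the resulting host virtual
range to mbind(); that call only operates on the caller's own address space,
which is why binding from libvirt would need the extension discussed here.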
That's just retarded, and no you won't get such extensions. Poking at
another process's virtual address space is just daft. Esp. if there's no
actual reason for it.
Need to separate the binding vs the policy mgmt. The policy mgmt could
still be done outside, whereas the binding could still be done from w/in
QEMU. A simple monitor interface to rebalance vcpu memory allocations
to different nodes could very well schedule vcpu thread work in QEMU.
I really would prefer to avoid having such an interface. It's a shotgun that
will only result in many poor feet being maimed. I can't tell you the number of
times I've encountered people using CPU pinning when they have absolutely no
business doing CPU pinning.
If we really believe such an interface should exist, then the interface should
really be from the kernel. Once we have memgroups, there's no reason to involve
QEMU at all. QEMU can define the memgroups based on the NUMA nodes and then
it's up to the kernel as to whether it exposes controls to explicitly bind
memgroups within a process or not.
Regards,
Anthony Liguori
So, I agree, even if there is some external policy mgmt, it could still
easily work w/ QEMU to use Peter's proposed interface.
thanks,
-chris