Re: [PATCH 0/3][RFC] NUMA: add host side pinning

Avi Kivity wrote:
On 06/24/2010 01:58 PM, Andre Przywara wrote:
So who would create the /dev/shm/nodeXX files?
Currently QEMU does. It generates a reasonably unique filename, opens the file, and unlinks it. The change would be to name the file after the option and not to unlink it.

I can imagine starting numactl before qemu, even though that's
cumbersome. I don't think it's feasible to start numactl after
qemu is running. That would involve so much magic that I'd prefer
qemu to call numactl itself.
With the current code the files would not exist before QEMU allocates RAM, and after that QEMU could already have touched pages before numactl sets the policy.

Non-anonymous memory doesn't work well with ksm and transparent hugepages. Is it possible to use anonymous memory rather than file backed?
I'd prefer non-file-backed memory, too, but that is how the current huge pages implementation works. We could use MAP_HUGETLB and declare NUMA _and_ huge pages as 2.6.32+ only. Unfortunately I didn't find an easy way to detect whether the kernel supports the MAP_HUGETLB flag. If it does not, mmap seems to silently ignore the flag and use 4KB pages instead.

To avoid this I'd like to see the pinning done from within QEMU. I am not sure whether calling numactl via system() and friends is OK; I'd prefer to run the syscalls directly (like in patch 3/3) and pull the necessary options into the -numa pin,... command line. We could mimic numactl's syntax here.

Definitely not use system(), but IIRC numactl has a library interface?
Right, that is what patch 3/3 includes and uses. I got the impression Anthony wanted to avoid reimplementing parts of numactl, especially the full flexibility of its command line interface (such as specifying nodes, policies and interleaving). I want QEMU to use the library and pull the necessary options into the -numa pin,... parsing, even if this means duplicating numactl functionality.
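As an idea of how much duplication that means for the node specification alone, here is a small sketch of a parser for a numactl-style node list such as "0-2,5". It covers only plain numbers and ranges, not everything numactl accepts. The function name parse_node_list() and the 64-node limit are assumptions for the example, not part of the patches.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/*
 * Hypothetical parser: turn a node list like "0-2,5" into a bitmask
 * with one bit per host node. Supports up to 64 nodes.
 *
 * Returns 0 and fills *mask on success, -1 on malformed input.
 */
static int parse_node_list(const char *s, unsigned long long *mask)
{
    unsigned long long m = 0;

    assert(s != NULL && mask != NULL);

    for (;;) {
        char *end;
        unsigned long lo, hi, n;

        /* Every element has to start with a digit. */
        if (*s < '0' || *s > '9')
            return -1;
        errno = 0;
        lo = strtoul(s, &end, 10);
        if (errno)
            return -1;
        hi = lo;

        if (*end == '-') {           /* range "lo-hi" */
            s = end + 1;
            if (*s < '0' || *s > '9')
                return -1;
            hi = strtoul(s, &end, 10);
            if (errno)
                return -1;
        }
        if (lo > hi || hi >= 64)
            return -1;

        for (n = lo; n <= hi; n++)
            m |= 1ULL << n;

        if (*end == '\0')
            break;
        if (*end != ',')
            return -1;
        s = end + 1;
    }

    *mask = m;
    return 0;
}
```

For "0-2,5" this yields the mask 0x27, i.e. nodes 0, 1, 2 and 5. The result could then be handed to the library or the syscall as the node mask.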

Regards,
Andre.

--
Andre Przywara
AMD-Operating System Research Center (OSRC), Dresden, Germany
Tel: +49 351 448-3567-12


