Re: [PATCH 0/3][RFC] NUMA: add host side pinning

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Anthony Liguori wrote:
> On 06/24/2010 06:42 AM, Avi Kivity wrote:
>> On 06/24/2010 02:34 PM, Andre Przywara wrote:
>>>> Non-anonymous memory doesn't work well with ksm and transparent
>>>> hugepages.  Is it possible to use anonymous memory rather than file
>>>> backed?
>>>
>>> I'd prefer non-file backed, too. But that is how the current huge
>>> pages implementation is done. We could use MAP_HUGETLB and declare
>>> NUMA _and_ huge pages as 2.6.32+ only. Unfortunately I didn't find
>>> an easy way to detect the presence of the MAP_HUGETLB flag. If the
>>> kernel does not support it, it seems that mmap silently ignores it
>>> and uses 4KB pages instead.
>>
>> That sucks, unfortunately it is normal practice.  However it is a
>> soft failure, everything works just a bit slower.  So it's probably
>> acceptable.
>>
>>>>> To avoid this I'd like to see the pinning done from within QEMU. I
>>>>> am not sure whether calling numactl via system() and friends is
>>>>> OK, I'd prefer to run the syscalls directly (like in patch 3/3)
>>>>> and pull the necessary options into the -numa pin,... command
>>>>> line. We could mimic numactl's syntax here.
>>>>
>>>> Definitely not use system(), but IIRC numactl has a library interface?
>>> Right, that is what I include in patch 3/3 and use. I got the
>>> impression Anthony wanted to avoid reimplementing parts of numactl,
>>> especially enabling the full flexibility of the command line
>>> interface (like specifying nodes, policies and interleaving).
>>> I want QEMU to use the library and pull the necessary options into
>>> the -numa pin,... parsing, even if this means duplicating numactl
>>> functionality.
>>>
>>
>> I agree with that.  It's a lot easier to use a single tool than to
>> try to integrate things yourself, the unix tradition of grep | sort |
>> uniq -c | sort -n notwithstanding.  Especially when one of the tools
>> is qemu.
>
> I could disagree more here.  This is why we don't support CPU pinning
> and instead provide PID information for each VCPU thread.
>
> The folks that want to use pinning are not notice users.  They are not
> going to be happy unless you can make full use of existing tools. 
> That means replicating all of numactl's functionality (which is not
> what the current patches do) or enable numactl to be used with a guest.

So how about some QMP plumbing that would allow numactl to create the
VMs at defined ranges? So you'd basically get numactl --run-qemu --
qemu-kvm -blah -foo

Alex

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]
  Powered by Linux