On Thu, Apr 21, 2016 at 10:25:24AM +0100, Marc Zyngier wrote:
> Hey Andrew,
> 
> On 21/04/16 08:04, Andrew Jones wrote:
> > On Wed, Apr 20, 2016 at 06:33:54PM +0100, Marc Zyngier wrote:
> >> On Wed, 20 Apr 2016 07:08:39 -0700
> >> Ashok Kumar <ashoks@xxxxxxxxxxxx> wrote:
> >>
> >>> For guests with NUMA configuration, Node ID needs to
> >>> be recorded in the respective affinity byte of MPIDR_EL1.
> >>
> >> As others have said before, the mapping between the NUMA hierarchy and
> >> MPIDR_EL1 is completely arbitrary, and only the firmware description
> >> can help the kernel in interpreting the affinity levels.
> >>
> >> If you want any patch like this one to be considered, I'd like to see
> >> the corresponding userspace that:
> >>
> >> - programs the affinity into the vcpus,
> >
> > I have a start on this for QEMU that I can dust off and send as an RFC
> > soon.
> >
> >> - pins the vcpus to specific physical CPUs,
> >
> > This wouldn't be part of the userspace directly interacting with KVM,
> > but rather a higher level (even higher than libvirt, e.g.
> > openstack/ovirt). I also don't think we should need to worry about
> > which/how the physical cpus get chosen. Let's assume that entity
> > knows how to best map the guest's virtual topology to a physical one.
> 
> Surely the platform emulation userspace has to implement the pinning
> itself, because I can't see high level tools being involved in the
> creation of the vcpu threads themselves.

The pinning comes after the threads are created, but before they are
run. The virtual topology created for a guest may or may not map well
to the physical topology of a given host, but that's not a problem for
the emulation; it's a problem for the higher level application trying
to fit the two together.

> 
> Also, I'd like to have a "simple" tool to test this without having to
> deploy openstack (the day this becomes mandatory for kernel development,
> I'll move my career to something more... agricultural).
> 
> So something in QEMU would be really good...
> 

Testing the virtual topology only requires booting a guest, whether or
not its vcpus are pinned. Testing whether the virtual topology was
worth creating does require the pinning, plus perf measurements, but
even then we don't need the pinning in QEMU itself. We can start the
guest paused, run a script that does a handful of tasksets, and then
resume the guest (a rough sketch of that flow is appended below). Or we
can just use libvirt, which allows vcpu affinities to be saved with the
guest definition, so they are applied automatically at guest launch.

> >
> >> - exposes the corresponding firmware description (either DT or ACPI) to
> >> the kernel.
> >
> > The QEMU patches I've started on already generate the DT (the cpu-map
> > node). I started looking into how to do it for ACPI too, but there
> > were some questions about whether or not the topology description
> > tables added to the 6.1 spec were sufficient. I can send the DT part
> > soon, and continue to look into the ACPI part later though.
> 
> That'd be great. Can you please sync with Ashok so that we have
> something consistent between the two of you?

Yup. I'm hoping Ashok will chime in to share any userspace status they
have.

Thanks,
drew
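
P.S. For illustration only, here's a rough, untested sketch of the
"start paused, pin, resume" flow mentioned above. It assumes the guest
was started with -S and a QMP socket, and it uses Python's
sched_setaffinity() instead of taskset. The socket path and the
vcpu-to-pcpu map are made-up placeholders; the QMP commands
(query-cpus, cont) are the standard ones.

#!/usr/bin/env python3
# Sketch: pin the vcpu threads of a paused QEMU guest, then resume it.
# Assumes QEMU was launched with something like:
#   -S -qmp unix:/tmp/qmp.sock,server,nowait
# (path is a made-up example)

import json
import os
import socket

QMP_SOCK = "/tmp/qmp.sock"                  # placeholder socket path
VCPU_TO_PCPU = {0: 8, 1: 9, 2: 24, 3: 25}   # example vcpu -> pcpu map

def qmp_command(f, cmd, **args):
    """Send one QMP command and return the 'return' part of the reply."""
    msg = {"execute": cmd}
    if args:
        msg["arguments"] = args
    f.write(json.dumps(msg) + "\n")
    f.flush()
    while True:
        reply = json.loads(f.readline())
        if "return" in reply:
            return reply["return"]
        if "error" in reply:
            raise RuntimeError(reply["error"])
        # anything else is an asynchronous event; skip it

def main():
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    sock.connect(QMP_SOCK)
    f = sock.makefile("rw")

    json.loads(f.readline())                # consume the QMP greeting
    qmp_command(f, "qmp_capabilities")      # enter command mode

    # query-cpus reports the host thread id backing each vcpu
    for cpu in qmp_command(f, "query-cpus"):
        vcpu = cpu["CPU"]
        tid = cpu["thread_id"]
        pcpu = VCPU_TO_PCPU[vcpu]
        os.sched_setaffinity(tid, {pcpu})   # the 'taskset -pc' equivalent
        print("vcpu %d (tid %d) -> pcpu %d" % (vcpu, tid, pcpu))

    qmp_command(f, "cont")                  # resume the paused guest

if __name__ == "__main__":
    main()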