On Thu, Apr 21, 2016 at 11:56:03AM +0200, Andrew Jones wrote:
> On Thu, Apr 21, 2016 at 10:25:24AM +0100, Marc Zyngier wrote:
> > Hey Andrew,
> >
> > On 21/04/16 08:04, Andrew Jones wrote:
> > > On Wed, Apr 20, 2016 at 06:33:54PM +0100, Marc Zyngier wrote:
> > >> On Wed, 20 Apr 2016 07:08:39 -0700
> > >> Ashok Kumar <ashoks@xxxxxxxxxxxx> wrote:
> > >>
> > >>> For guests with NUMA configuration, Node ID needs to
> > >>> be recorded in the respective affinity byte of MPIDR_EL1.
> > >>
> > >> As others have said before, the mapping between the NUMA hierarchy
> > >> and MPIDR_EL1 is completely arbitrary, and only the firmware
> > >> description can help the kernel in interpreting the affinity levels.
> > >>
> > >> If you want any patch like this one to be considered, I'd like to
> > >> see the corresponding userspace that:
> > >>
> > >> - programs the affinity into the vcpus,
> > >
> > > I have a start on this for QEMU that I can dust off and send as an
> > > RFC soon.
> > >
> > >> - pins the vcpus to specific physical CPUs,
> > >
> > > This wouldn't be part of the userspace directly interacting with
> > > KVM, but rather a higher level (even higher than libvirt, e.g.
> > > openstack/ovirt). I also don't think we should need to worry about
> > > which/how the physical cpus get chosen. Let's assume that entity
> > > knows how to best map the guest's virtual topology to a physical one.
> >
> > Surely the platform emulation userspace has to implement the pinning
> > itself, because I can't see high level tools being involved in the
> > creation of the vcpu threads themselves.
>
> The pinning comes after the threads are created, but before they are
> run. The virtual topology created for a guest may or may not map well
> to the physical topology of a given host. That's not the problem of
> the emulation, though; that's a problem for a high-level application
> trying to fit it.
>
> > Also, I'd like to have a "simple" tool to test this without having to
> > deploy openstack (the day this becomes mandatory for kernel
> > development, I'll move my career to something more... agricultural).
> >
> > So something in QEMU would be really good...
>
> To test the virtual topology only requires booting a guest, whether
> the vcpus are pinned or not. To test that it was worth the effort to
> create a virtual topology does require the pinning, and the perf
> measuring. However, we still don't need the pinning in QEMU. We can
> start a guest paused, run a script that does a handful of tasksets,
> and then resume the guest. Or, just use libvirt, which allows one
> to save vcpu affinities, and thus on guest launch it will automatically
> do the affinity setting for you.
>
> > >> - exposes the corresponding firmware description (either DT or
> > >> ACPI) to the kernel.
> > >
> > > The QEMU patches I've started on already generate the DT (the
> > > cpu-map node). I started looking into how to do it for ACPI too,
> > > but there were some questions about whether or not the topology
> > > description tables added to the 6.1 spec were sufficient. I can
> > > send the DT part soon, and continue to look into the ACPI part
> > > later though.
> >
> > That'd be great. Can you please sync with Ashok so that we have
> > something consistent between the two of you?
>
> Yup. I'm hoping Ashok will chime in to share any userspace status
> they have.

I tested it using QEMU's ARM NUMA patchset [1], and I don't have any
changes for cpu-map. I just used the thread, core, and socket
information from QEMU's "-smp" command-line argument to populate the
affinity.
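For illustration only, a minimal C sketch (hypothetical helper, not
taken from the posted patches) of one way the -smp socket/core/thread
indices could be packed into the MPIDR_EL1 affinity fields; as Marc
notes above, the mapping itself is arbitrary and only gains meaning
through the DT/ACPI description exposed to the guest:

#include <stdint.h>

/*
 * Hypothetical helper: pack a vcpu's topology indices into the
 * MPIDR_EL1 affinity fields.  One possible mapping is thread -> Aff0,
 * core -> Aff1, socket/NUMA node -> Aff2.  Affinity layout:
 * Aff0 [7:0], Aff1 [15:8], Aff2 [23:16], Aff3 [39:32]; other bits
 * (MT, U, RES1) are ignored in this sketch.
 */
static uint64_t vcpu_mpidr_affinity(unsigned int thread,
                                    unsigned int core,
                                    unsigned int socket)
{
        return ((uint64_t)(thread & 0xff)) |
               ((uint64_t)(core   & 0xff) << 8) |
               ((uint64_t)(socket & 0xff) << 16);
}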
I was hoping to see the reception to this patch and then post the QEMU
changes.

[1] https://lists.gnu.org/archive/html/qemu-arm/2016-01/msg00363.html

Thanks,
Ashok

> Thanks,
> drew
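For the pinning step discussed above, a minimal sketch (assumptions:
the vcpu thread IDs are already known, e.g. from QEMU's query-cpus QMP
command, and a guest-to-host CPU mapping has been chosen elsewhere) of
what a per-thread taskset script, or libvirt's saved vcpu affinities,
end up doing before the guest is resumed:

#define _GNU_SOURCE
#include <sched.h>
#include <sys/types.h>

/*
 * Sketch only: pin one vcpu thread (identified by its TID) to a
 * chosen physical CPU, equivalent to "taskset -p -c <cpu> <tid>".
 */
static int pin_vcpu_thread(pid_t tid, int phys_cpu)
{
        cpu_set_t set;

        CPU_ZERO(&set);
        CPU_SET(phys_cpu, &set);
        return sched_setaffinity(tid, sizeof(set), &set);
}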