On Thu, Apr 21, 2016 at 11:56:03AM +0200, Andrew Jones wrote:
> On Thu, Apr 21, 2016 at 10:25:24AM +0100, Marc Zyngier wrote:
> > Hey Andrew,
> >
> > On 21/04/16 08:04, Andrew Jones wrote:
> > > On Wed, Apr 20, 2016 at 06:33:54PM +0100, Marc Zyngier wrote:
> > >> On Wed, 20 Apr 2016 07:08:39 -0700
> > >> Ashok Kumar <ashoks@xxxxxxxxxxxx> wrote:
> > >>
> > >>> For guests with NUMA configuration, Node ID needs to
> > >>> be recorded in the respective affinity byte of MPIDR_EL1.
> > >>
> > >> As others have said before, the mapping between the NUMA hierarchy
> > >> and MPIDR_EL1 is completely arbitrary, and only the firmware
> > >> description can help the kernel in interpreting the affinity levels.
> > >>
> > >> If you want any patch like this one to be considered, I'd like to
> > >> see the corresponding userspace that:
> > >>
> > >> - programs the affinity into the vcpus,
> > >
> > > I have a start on this for QEMU that I can dust off and send as an
> > > RFC soon.
> > >
> > >> - pins the vcpus to specific physical CPUs,
> > >
> > > This wouldn't be part of the userspace directly interacting with
> > > KVM, but rather a higher level (even higher than libvirt, e.g.
> > > openstack/ovirt). I also don't think we should need to worry about
> > > which/how the physical cpus get chosen. Let's assume that entity
> > > knows how to best map the guest's virtual topology to a physical one.
> >
> > Surely the platform emulation userspace has to implement the pinning
> > itself, because I can't see high level tools being involved in the
> > creation of the vcpu threads themselves.
>
> The pinning comes after the threads are created, but before they are
> run. The virtual topology created for a guest may or may not map well
> to the physical topology of a given host. That's not the problem of
> the emulation, though; that's a problem for a high-level application
> trying to fit it.
>
> > Also, I'd like to have a "simple" tool to test this without having to
> > deploy openstack (the day this becomes mandatory for kernel
> > development, I'll move my career to something more... agricultural).
> >
> > So something in QEMU would be really good...
>
> To test the virtual topology only requires booting a guest, whether
> the vcpus are pinned or not. To test that it was worth the effort to
> create a virtual topology does require the pinning, and the perf
> measuring. However, we still don't need the pinning in QEMU. We can
> start a guest paused, run a script that does a handful of tasksets,
> and then resume the guest. Or, just use libvirt, which allows one
> to save vcpu affinities, and thus on guest launch it will automatically
> do the affinity setting for you.
>
> > >> - exposes the corresponding firmware description (either DT or
> > >> ACPI) to the kernel.
> > >
> > > The QEMU patches I've started on already generate the DT (the
> > > cpu-map node). I started looking into how to do it for ACPI too,
> > > but there were some questions about whether or not the topology
> > > description tables added to the 6.1 spec were sufficient. I can
> > > send the DT part soon, and continue to look into the ACPI part
> > > later though.
> >
> > That'd be great. Can you please sync with Ashok so that we have
> > something consistent between the two of you?
>
> Yup. I'm hoping Ashok will chime in to share any userspace status
> they have.

I tested it using QEMU's ARM NUMA patchset [1], and I don't have any
changes for cpu-map. I just used the thread, core, and socket
information from QEMU's "-smp" command-line argument to populate the
affinity.
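For illustration only, a minimal C sketch (hypothetical helper, not
taken from the posted patches) of one way the -smp socket/core/thread
indices could be packed into the MPIDR_EL1 affinity fields; as Marc
notes above, the mapping itself is arbitrary and only gains meaning
through the DT/ACPI description exposed to the guest:

#include <stdint.h>

/*
 * Hypothetical helper: pack a vcpu's topology indices into the
 * MPIDR_EL1 affinity fields.  One possible mapping is thread -> Aff0,
 * core -> Aff1, socket/NUMA node -> Aff2.  Affinity layout:
 * Aff0 [7:0], Aff1 [15:8], Aff2 [23:16], Aff3 [39:32]; other bits
 * (MT, U, RES1) are ignored in this sketch.
 */
static uint64_t vcpu_mpidr_affinity(unsigned int thread,
                                    unsigned int core,
                                    unsigned int socket)
{
        return ((uint64_t)(thread & 0xff)) |
               ((uint64_t)(core   & 0xff) << 8) |
               ((uint64_t)(socket & 0xff) << 16);
}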
I was hoping to see the reception to this patch and then post the QEMU
changes.

[1] https://lists.gnu.org/archive/html/qemu-arm/2016-01/msg00363.html

Thanks,
Ashok

> Thanks,
> drew
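For the pinning step discussed above, a minimal sketch (assumptions:
the vcpu thread IDs are already known, e.g. from QEMU's query-cpus QMP
command, and a guest-to-host CPU mapping has been chosen elsewhere) of
what a per-thread taskset script, or libvirt's saved vcpu affinities,
end up doing before the guest is resumed:

#define _GNU_SOURCE
#include <sched.h>
#include <sys/types.h>

/*
 * Sketch only: pin one vcpu thread (identified by its TID) to a
 * chosen physical CPU, equivalent to "taskset -p -c <cpu> <tid>".
 */
static int pin_vcpu_thread(pid_t tid, int phys_cpu)
{
        cpu_set_t set;

        CPU_ZERO(&set);
        CPU_SET(phys_cpu, &set);
        return sched_setaffinity(tid, sizeof(set), &set);
}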