Re: [PATCH 00/19] KVM: PPC: Book3S HV: add XIVE native exploitation mode

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 1/22/19 5:46 AM, Paul Mackerras wrote:
> On Mon, Jan 07, 2019 at 07:43:12PM +0100, Cédric Le Goater wrote:
>> Hello,
>>
>> On the POWER9 processor, the XIVE interrupt controller can control
>> interrupt sources using MMIO to trigger events, to EOI or to turn off
>> the sources. Priority management and interrupt acknowledgment is also
>> controlled by MMIO in the CPU presenter subengine.
>>
>> PowerNV/baremetal Linux runs natively under XIVE but sPAPR guests need
>> special support from the hypervisor to do the same. This is called the
>> XIVE native exploitation mode and today, it can be activated under the
>> PowerPC Hypervisor, pHyp. However, Linux/KVM lacks XIVE native support
>> and still offers the old interrupt mode interface using a
>> XICS-over-XIVE glue which implements the XICS hcalls.
>>
>> The following series is proposal to add the same support under KVM.
>>
>> A new KVM device is introduced for the XIVE native exploitation
>> mode. It reuses most of the XICS-over-XIVE glue implementation
>> structures which are internal to KVM but has a completely different
>> interface. A set of Hypervisor calls configures the sources and the
>> event queues and from there, all control is done by the guest through
>> MMIOs.
>>
>> These MMIO regions (ESB and TIMA) are exposed to guests in QEMU,
>> similarly to VFIO, and the associated VMAs are populated dynamically
>> with the appropriate pages using a fault handler. This is implemented
>> with a couple of KVM device ioctls.
>>
>> On a POWER9 sPAPR machine, the Client Architecture Support (CAS)
>> negotiation process determines whether the guest operates with a
>> interrupt controller using the XICS legacy model, as found on POWER8,
>> or in XIVE exploitation mode. Which means that the KVM interrupt
>> device should be created at runtime, after the machine as started.
>> This requires extra KVM support to create/destroy KVM devices. The
>> last patches are an attempt to solve that problem.
>>
>> Migration has its own specific needs. The patchset provides the
>> necessary routines to quiesce XIVE, to capture and restore the state
>> of the different structures used by KVM, OPAL and HW. Extra OPAL
>> support is required for these.
> 
> Thanks for the patchset.  It mostly looks good, but there are some
> more things we need to consider, and I think a v2 will be needed.
>> One general comment I have is that there are a lot of acronyms in this
> code and you mostly seem to assume that people will know what they all
> mean.  It would make the code more readable if you provide the
> expansion of the acronym on first use in a comment or whatever.  For
> example, one of the patches in this series talks about the "EAS"

 Event Assignment Structure, a.k.a IVE (Interrupt Virtualization Entry)

All the names changed somewhere between XIVE v1 and XIVE v2. OPAL and
Linux should be adjusted ...

> without ever expanding it in any comment or in the patch description,
> and I have forgotten just at the moment what EAS stands for (I just
> know that understanding the XIVE is not eas-y. :)
ah ! yes. But we have great documentation :)

We pushed some high level description of XIVE in QEMU :

  https://git.qemu.org/?p=qemu.git;a=blob;f=include/hw/ppc/xive.h;h=ec23253ba448e25c621356b55a7777119a738f8e;hb=HEAD

I should do the same for Linux with a KVM section to explain the 
interfaces which do not directly expose the underlying XIVE concepts. 
It's better to understand a little what is happening under the hood.

> Another general comment is that you seem to have written all this
> code assuming we are using HV KVM in a host running bare-metal.

Yes. I didn't look at the other configurations. I thought that we could
use the kernel_irqchip=off option to begin with. A couple of checks
are indeed missing.

> However, we could be using PR KVM (either in a bare-metal host or in a
> guest), or we could be doing nested HV KVM where we are using the
> kvm_hv module inside a KVM guest and using special hypercalls for
> controlling our guests.

Yes. 

It would be good to talk a little about the nested support (offline 
may be) to make sure that we are not missing some major interface that 
would require a lot of change. If we need to prepare ground, I think
the timing is good.

The size of the IRQ number space might be a problem. It seems we 
would need to increase it considerably to support multiple nested 
guests. That said I haven't look much how nested is designed.  

> It would be perfectly acceptable for now to say that we don't yet
> support XIVE exploitation in those scenarios, as long as we then make
> sure that the new KVM capability reports false in those scenarios, and
> any attempt to use the XIVE exploitation interfaces fails cleanly.

ok. That looks the best approach for now.

> I don't see that either of those is true in the patch set as it
> stands, so that is one area that needs to be fixed.
> 
> A third general comment is that the new KVM interfaces you have added
> need to be documented in the files under Documentation/virtual/kvm.

ok. 

Thanks,

C. 





[Index of Archives]     [KVM Development]     [KVM ARM]     [KVM ia64]     [Linux Virtualization]     [Linux USB Devel]     [Linux Video]     [Linux Audio Users]     [Linux Kernel]     [Linux SCSI]     [Big List of Linux Books]

  Powered by Linux