On Sat, Feb 29, 2020 at 09:27:54AM +0100, Cédric Le Goater wrote:
> On 2/29/20 8:54 AM, Ram Pai wrote:
> > XIVE is not correctly enabled for Secure VM in the KVM Hypervisor yet.
> >
> > Hence Secure VM, must always default to XICS interrupt controller.
>
> have you tried XIVE emulation 'kernel-irqchip=off' ?

Yes, and it hangs. I think that option continues to enable some variant
of XIVE in the VM. There are some known deficiencies in the negotiation
between KVM and the ultravisor, resulting in a hang in the SVM.

>
> > If XIVE is requested through kernel command line option "xive=on",
> > override and turn it off.
>
> This is incorrect. It is negotiated through CAS depending on the FW
> capabilities and the KVM capabilities.

Yes, I understand. QEMU/KVM have predetermined a set of capabilities
that they can offer to the VM. The kernel within the VM has a list of
capabilities it needs to operate correctly. So both negotiate and
determine something mutually amicable.

Here I am talking about the list of capabilities that the kernel
determines it needs in order to operate correctly. "xive=on" is one of
those capabilities that the VM administrator tells the kernel to enable.
Unfortunately, if the VM administrator blindly requests it, the kernel
must override that request if it knows that it will be switching the VM
into an SVM soon. There is no point negotiating a capability with QEMU
through CAS if the kernel knows it cannot handle that capability.

>
> > If XIVE is the only supported platform interrupt controller; specified
> > through qemu option "ic-mode=xive", simply abort. Otherwise default to
> > XICS.
>
>
> I don't think it is a good approach to downgrade the guest kernel
> capabilities this way.
>
> PAPR has specified the CAS negotiation process for this purpose. It
> comes in two parts under KVM. First the KVM hypervisor advertises or
> not a capability to QEMU. The second is the CAS negotiation process
> between QEMU and the guest OS.

Unfortunately, this is not viable.

At the time the hypervisor advertises its capabilities to QEMU, the
hypervisor has no idea whether that VM will switch into an SVM or not.
The decision to switch into an SVM is taken by the kernel running in the
VM. This happens much later, after the hypervisor has already conveyed
its capabilities to QEMU, and QEMU has then instantiated the VM.

As a result, CAS in prom_init is the only place where this negotiation
can take place.

>
> The SVM specifications might not be complete yet and if some features
> are incompatible, I think we should modify the capabilities advertised
> by the hypervisor : no XIVE in case of SVM. QEMU will automatically
> use the fallback path and emulate the XIVE device, same as setting
> 'kernel-irqchip=off'.

As mentioned above, this would be an excellent approach if the
hypervisor were aware of the VM's intent to switch into an SVM. Neither
the hypervisor nor QEMU knows. Only the kernel running within the VM
knows about it.

Do you still think my approach is wrong?

RP
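
For illustration only, the decision described in this thread could be
sketched in C as below. This is not the patch and not kernel code: the
enum, struct, field, and function names are all hypothetical, and the
non-SVM branch is a simplification, since the real outcome for a normal
VM is negotiated through CAS.

```c
#include <stdbool.h>

/* Hypothetical names throughout; none of this is taken from the patch. */
enum irq_mode { IRQ_XICS, IRQ_XIVE, IRQ_ABORT };

struct irq_request {
    bool will_be_svm;    /* kernel knows it will switch into a Secure VM */
    bool xive_requested; /* e.g. "xive=on" on the kernel command line */
    bool xics_available; /* false when only XIVE is offered (ic-mode=xive) */
};

/*
 * Decide which interrupt controller the guest kernel should ask for
 * before it starts CAS negotiation.
 */
enum irq_mode choose_irq_mode(const struct irq_request *req)
{
    if (!req->will_be_svm) {
        /* Normal VM (simplified): honor the request, or use XIVE if
         * it is the only controller on offer. */
        if (req->xive_requested || !req->xics_available)
            return IRQ_XIVE;
        return IRQ_XICS;
    }

    /* Secure VM: XIVE cannot be used, so XICS must be available. */
    if (!req->xics_available)
        return IRQ_ABORT;

    /* Secure VM: override any "xive=on" request and fall back to XICS. */
    return IRQ_XICS;
}
```

The point of the sketch is that the only input the hypervisor and QEMU
do not have is will_be_svm, which is why the check has to live in the
guest kernel.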