On 02/08/2018 17:08, Vitaly Kuznetsov wrote: > When VMX is used with flexpriority disabled (because of no support or > if disabled with module parameter) MMIO interface to lAPIC is still > available in x2APIC mode while it shouldn't be (kvm-unit-tests): > > PASS: apic_disable: Local apic enabled in x2APIC mode > PASS: apic_disable: CPUID.1H:EDX.APIC[bit 9] is set > FAIL: apic_disable: *0xfee00030: 50014 > > The issue appears because we basically do nothing while switching to > x2APIC mode when APIC access page is not used. apic_mmio_{read,write} > only check if lAPIC is disabled before proceeding to actual write. > > When APIC access is virtualized we correctly manipulate with VMX controls > in vmx_set_virtual_apic_mode() and we don't get vmexits from memory writes > in x2APIC mode so there's no issue. > > Disabling MMIO interface seems to be easy. The question is: what do we > do with these reads and writes? If we add apic_x2apic_mode() check to > apic_mmio_in_range() and return -EOPNOTSUPP these reads and writes will > go to userspace. When lAPIC is in kernel, Qemu uses this interface to > inject MSIs only (see kvm_apic_mem_write() in hw/i386/kvm/apic.c). This > somehow works with disabled lAPIC but when we're in xAPIC mode we will > get a real injected MSI from every write to lAPIC. Not good. > > The simplest solution seems to be to just ignore writes to the region > and return ~0 for all reads when we're in x2APIC mode. This is what this > patch does. However, this approach is inconsistent with what currently > happens when flexpriority is enabled: we allocate APIC access page and > create KVM memory region so in x2APIC modes all reads and writes go to > this pre-allocated page which is, btw, the same for all vCPUs. > > Signed-off-by: Vitaly Kuznetsov <vkuznets@xxxxxxxxxx> > --- > Changes since RFC: > - add KVM_X86_QUIRK_LAPIC_MMIO_HOLE disabling the newly introduced > 'MMIO hole' behavior [Paolo Bonzini] > --- > arch/x86/include/uapi/asm/kvm.h | 1 + > arch/x86/kvm/lapic.c | 22 +++++++++++++++++++--- > 2 files changed, 20 insertions(+), 3 deletions(-) > > diff --git a/arch/x86/include/uapi/asm/kvm.h b/arch/x86/include/uapi/asm/kvm.h > index 86299efa804a..fd23d5778ea1 100644 > --- a/arch/x86/include/uapi/asm/kvm.h > +++ b/arch/x86/include/uapi/asm/kvm.h > @@ -377,6 +377,7 @@ struct kvm_sync_regs { > > #define KVM_X86_QUIRK_LINT0_REENABLED (1 << 0) > #define KVM_X86_QUIRK_CD_NW_CLEARED (1 << 1) > +#define KVM_X86_QUIRK_LAPIC_MMIO_HOLE (1 << 2) > > #define KVM_STATE_NESTED_GUEST_MODE 0x00000001 > #define KVM_STATE_NESTED_RUN_PENDING 0x00000002 > diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c > index b5cd8465d44f..83c4e8cc7eb9 100644 > --- a/arch/x86/kvm/lapic.c > +++ b/arch/x86/kvm/lapic.c > @@ -1291,9 +1291,8 @@ EXPORT_SYMBOL_GPL(kvm_lapic_reg_read); > > static int apic_mmio_in_range(struct kvm_lapic *apic, gpa_t addr) > { > - return kvm_apic_hw_enabled(apic) && > - addr >= apic->base_address && > - addr < apic->base_address + LAPIC_MMIO_LENGTH; > + return addr >= apic->base_address && > + addr < apic->base_address + LAPIC_MMIO_LENGTH; > } > > static int apic_mmio_read(struct kvm_vcpu *vcpu, struct kvm_io_device *this, > @@ -1305,6 +1304,15 @@ static int apic_mmio_read(struct kvm_vcpu *vcpu, struct kvm_io_device *this, > if (!apic_mmio_in_range(apic, address)) > return -EOPNOTSUPP; > > + if (!kvm_apic_hw_enabled(apic) || apic_x2apic_mode(apic)) { > + if (!kvm_check_has_quirk(vcpu->kvm, > + KVM_X86_QUIRK_LAPIC_MMIO_HOLE)) > + return -EOPNOTSUPP; > + > + memset(data, 0xff, len); > + return 0; > + } > + > kvm_lapic_reg_read(apic, offset, len, data); > > return 0; > @@ -1864,6 +1872,14 @@ static int apic_mmio_write(struct kvm_vcpu *vcpu, struct kvm_io_device *this, > if (!apic_mmio_in_range(apic, address)) > return -EOPNOTSUPP; > > + if (!kvm_apic_hw_enabled(apic) || apic_x2apic_mode(apic)) { > + if (!kvm_check_has_quirk(vcpu->kvm, > + KVM_X86_QUIRK_LAPIC_MMIO_HOLE)) > + return -EOPNOTSUPP; > + > + return 0; > + } > + > /* > * APIC register must be aligned on 128-bits boundary. > * 32/64/128 bits registers must be accessed thru 32 bits. > Queued, thanks. Paolo