On Fri, Feb 25, 2022 at 04:44:05PM +0200, Maxim Levitsky wrote:
>On Fri, 2022-02-25 at 16:22 +0800, Zeng Guang wrote:
>> Upcoming Intel CPUs will support virtual x2APIC MSR writes to the vICR,
>> i.e. will trap and generate an APIC-write VM-Exit instead of intercepting
>> the WRMSR. Add support for handling "nodecode" x2APIC writes, which
>> were previously impossible.
>>
>> Note, x2APIC MSR writes are 64 bits wide.
>>
>> Signed-off-by: Zeng Guang <guang.zeng@xxxxxxxxx>
>> ---
>>  arch/x86/kvm/lapic.c | 25 ++++++++++++++++++++++---
>>  1 file changed, 22 insertions(+), 3 deletions(-)
>>
>> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
>> index 629c116b0d3e..e4bcdab1fac0 100644
>> --- a/arch/x86/kvm/lapic.c
>> +++ b/arch/x86/kvm/lapic.c
>> @@ -67,6 +67,7 @@ static bool lapic_timer_advance_dynamic __read_mostly;
>>  #define LAPIC_TIMER_ADVANCE_NS_MAX	5000
>>  /* step-by-step approximation to mitigate fluctuation */
>>  #define LAPIC_TIMER_ADVANCE_ADJUST_STEP 8
>> +static int kvm_lapic_msr_read(struct kvm_lapic *apic, u32 reg, u64 *data);
>>
>>  static inline void __kvm_lapic_set_reg(char *regs, int reg_off, u32 val)
>>  {
>> @@ -2227,10 +2228,28 @@ EXPORT_SYMBOL_GPL(kvm_lapic_set_eoi);
>>  /* emulate APIC access in a trap manner */
>>  void kvm_apic_write_nodecode(struct kvm_vcpu *vcpu, u32 offset)
>>  {
>> -	u32 val = kvm_lapic_get_reg(vcpu->arch.apic, offset);
>> +	struct kvm_lapic *apic = vcpu->arch.apic;
>> +	u64 val;
>> +
>> +	if (apic_x2apic_mode(apic)) {
>> +		/*
>> +		 * When guest APIC is in x2APIC mode and IPI virtualization
>> +		 * is enabled, accessing APIC_ICR may cause trap-like VM-exit
>> +		 * on Intel hardware. Other offsets are not possible.
>> +		 */
>> +		if (WARN_ON_ONCE(offset != APIC_ICR))
>> +			return;
>>
>> -	/* TODO: optimize to just emulate side effect w/o one more write */
>> -	kvm_lapic_reg_write(vcpu->arch.apic, offset, val);
>> +		kvm_lapic_msr_read(apic, offset, &val);
>> +		if (val & APIC_ICR_BUSY)
>> +			kvm_x2apic_icr_write(apic, val);
>> +		else
>> +			kvm_apic_send_ipi(apic, (u32)val, (u32)(val >> 32));
>I don't fully understand the above code.
>
>First of where kvm_x2apic_icr_write is defined?

Sean introduced it in his "prep work for VMX IPI virtualization" series,
which has been merged into the kvm/queue branch.

https://git.kernel.org/pub/scm/virt/kvm/kvm.git/commit/?h=queue&id=7a641ca0c219e4bbe102f2634dbc7e06072fcd3c

>
>Second, I thought that busy bit is not used in x2apic mode?
>At least in intel's SDM, section 10.12.9 'ICR Operation in x2APIC Mode'
>this bit is not defined.

You are right. We will remove the pointless check against APIC_ICR_BUSY and
just invoke kvm_apic_send_ipi().

In that section, the SDM also says:

  With the removal of the Delivery Status bit, system software no longer
  has a reason to read the ICR. It remains readable only to aid in
  debugging; however, ***software should not assume the value returned by
  reading the ICR is the last written value***.