Roman Kagan <rkagan@xxxxxxxxxxxxx> writes: > Now that there's Hyper-V IOMMU driver, Linux can switch to x2apic mode > when supported by the vcpus. > > However, the apic access functions for Hyper-V enlightened apic assume > xapic mode only. > > As a result, Linux fails to bring up secondary cpus when run as a guest > in QEMU/KVM with both hv_apic and x2apic enabled. > > I didn't manage to make my instance of Hyper-V expose x2apic to the > guest; nor does Hyper-V spec document the expected behavior. However, > a Windows guest running in QEMU/KVM with hv_apic and x2apic and a big > number of vcpus (so that it turns on x2apic mode) does use enlightened > apic MSRs passing unshifted 32bit destination id and falls back to the > regular x2apic MSRs for less frequently used apic fields. > > So implement the same behavior, by replacing enlightened apic access > functions (only those where it makes a difference) with their > x2apic-aware versions when x2apic is in use. > > Fixes: 29217a474683 ("iommu/hyper-v: Add Hyper-V stub IOMMU driver") > Fixes: 6b48cb5f8347 ("X86/Hyper-V: Enlighten APIC access") > Cc: stable@xxxxxxxxxxxxxxx > Signed-off-by: Roman Kagan <rkagan@xxxxxxxxxxxxx> > --- > v1 -> v2: > - add ifdefs to handle !CONFIG_X86_X2APIC > > arch/x86/hyperv/hv_apic.c | 54 ++++++++++++++++++++++++++++++++++++--- > 1 file changed, 51 insertions(+), 3 deletions(-) > > diff --git a/arch/x86/hyperv/hv_apic.c b/arch/x86/hyperv/hv_apic.c > index 5c056b8aebef..eb1434ae9e46 100644 > --- a/arch/x86/hyperv/hv_apic.c > +++ b/arch/x86/hyperv/hv_apic.c > @@ -84,6 +84,44 @@ static void hv_apic_write(u32 reg, u32 val) > } > } > > +#ifdef CONFIG_X86_X2APIC > +static void hv_x2apic_icr_write(u32 low, u32 id) > +{ > + wrmsr(HV_X64_MSR_ICR, low, id); > +} AFAIU you're trying to mirror native_x2apic_icr_write() here but this is different from what hv_apic_icr_write() does (SET_APIC_DEST_FIELD(id)). Is it actually correct? (I think you've tested this and it is but) Michael, could you please shed some light here? > + > +static u32 hv_x2apic_read(u32 reg) > +{ > + u32 reg_val, hi; > + > + switch (reg) { > + case APIC_EOI: > + rdmsr(HV_X64_MSR_EOI, reg_val, hi); > + return reg_val; > + case APIC_TASKPRI: > + rdmsr(HV_X64_MSR_TPR, reg_val, hi); > + return reg_val; > + > + default: > + return native_apic_msr_read(reg); > + } > +} > + > +static void hv_x2apic_write(u32 reg, u32 val) > +{ > + switch (reg) { > + case APIC_EOI: > + wrmsr(HV_X64_MSR_EOI, val, 0); > + break; > + case APIC_TASKPRI: > + wrmsr(HV_X64_MSR_TPR, val, 0); > + break; > + default: > + native_apic_msr_write(reg, val); > + } > +} > +#endif /* CONFIG_X86_X2APIC */ > + > static void hv_apic_eoi_write(u32 reg, u32 val) > { > struct hv_vp_assist_page *hvp = hv_vp_assist_page[smp_processor_id()]; > @@ -262,9 +300,19 @@ void __init hv_apic_init(void) > if (ms_hyperv.hints & HV_X64_APIC_ACCESS_RECOMMENDED) { > pr_info("Hyper-V: Using MSR based APIC access\n"); > apic_set_eoi_write(hv_apic_eoi_write); > - apic->read = hv_apic_read; > - apic->write = hv_apic_write; > - apic->icr_write = hv_apic_icr_write; > +#ifdef CONFIG_X86_X2APIC > + if (x2apic_enabled()) { > + apic->read = hv_x2apic_read; > + apic->write = hv_x2apic_write; > + apic->icr_write = hv_x2apic_icr_write; > + } else { > +#endif > + apic->read = hv_apic_read; > + apic->write = hv_apic_write; > + apic->icr_write = hv_apic_icr_write; (just wondering): Is it always safe to assume that we cannot switch between apic_flat/x2apic in runtime? Moreover, the only difference between hv_apic_read/hv_apic_write and hv_x2apic_read/hv_x2apic_write is native_apic_mem_{read,write} -> native_apic_msr_{read,write}. Would it make sense to move if (x2apic_enabled()) and merge these functions? > +#ifdef CONFIG_X86_X2APIC > + } > +#endif > apic->icr_read = hv_apic_icr_read; > } > } -- Vitaly