On 26/11/2014 18:01, Nadav Amit wrote: > Paolo Bonzini <pbonzini@xxxxxxxxxx> wrote: > >> >> >> On 06/11/2014 17:45, Radim Krčmář wrote: >>> 2014-11-06 10:34+0100, Paolo Bonzini: >>>> On 05/11/2014 21:45, Nadav Amit wrote: >>>>> If I understand the SDM correctly, in such scenario (all APICs are >>>>> software disabled) the mode is left as the default - flat mode (see >>> >>> APIC doesn't have any global mode (it is just KVM's simplification), so >>> when a message lands on the system bus, it just compares MDA with LDR >>> and DFR ... >>> >>>>> section 10.6.2.2 "Logical Destination Mode”): "All processors that >>>>> have their APIC software enabled (using the spurious vector >>>>> enable/disable bit) must have their DFRs (Destination Format >>>>> Registers) programmed identically. The default mode for DFR is flat >>>>> mode.” >>> >>> I think the "default mode" points to reset state, which is flat DFR; >>> and it is mentioned only because of the following sentence >>> If you are using cluster mode, DFRs must be programmed before the APIC >>> is software enabled. >>> >>>> That's not what either Bochs or QEMU do, though. (Though in the case of >>>> Bochs I cannot find the place where reception of IPIs is prevented for >>>> software-disabled APICs, so I'm not sure how much to trust it in this case). >>>> >>>> I'm not sure why software-disabled APICs could have different DFRs, if >>>> the APICs can receive NMI IPIs. I'll ask Intel. >>> >>> When changing the mode, we can't switch DFR synchronously, so it has to >>> happen and NMI may be needed (watchdog?) before APIC configuration. >>> Explicit statement might have been a hint to be _extra_ careful when >>> using logical destination for INIT, NMI, ... I wonder what they'll say. >>> >>> Anyway, Paolo's patch seems to be in the right direction, I'd modify it >>> a bit though: >>> >>> LDR=0 isn't addressable in any logical mode, so we can ignore APICs that >>> don't set it and decide the mode by the last nonzero one. >>> This works in a situation, where one part is configured for cluster and >>> the rest is still in reset state. >>> >>> (It gets harder if we allow nonzero LDRs with different DFR ... >>> we'd need to split our logical map to handle it.) >>> >>> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c >>> index 758f838..6da303e1 100644 >>> --- a/arch/x86/kvm/lapic.c >>> +++ b/arch/x86/kvm/lapic.c >>> @@ -148,10 +148,6 @@ static void recalculate_apic_map(struct kvm *kvm) >>> goto out; >>> >>> new->ldr_bits = 8; >>> - /* flat mode is default */ >>> - new->cid_shift = 8; >>> - new->cid_mask = 0; >>> - new->lid_mask = 0xff; >>> new->broadcast = APIC_BROADCAST; >>> >>> kvm_for_each_vcpu(i, vcpu, kvm) { >>> @@ -166,7 +162,7 @@ static void recalculate_apic_map(struct kvm *kvm) >>> new->cid_mask = (1 << KVM_X2APIC_CID_BITS) - 1; >>> new->lid_mask = 0xffff; >>> new->broadcast = X2APIC_BROADCAST; >>> - } else if (kvm_apic_hw_enabled(apic)) { >>> + } else if (kvm_apic_get_reg(apic, APIC_LDR)) { >>> if (kvm_apic_get_reg(apic, APIC_DFR) == >>> APIC_DFR_CLUSTER) { >>> new->cid_shift = 4; >> >> I merged this patch and Nadav’s. > > Sorry for the late and long reply, but I got an issue with the new version > (and my previous version as well). Indeed, the SDM states that DFR should > be the same for enabled CPUs, and that the BIOS should get all CPUs in > either xAPIC or x2APIC. Yet, there is nothing that says all CPUs need to be > in xAPIC/x2APIC mode. > > In my tests (which pass on bare-metal), I got a scenario in which some CPUs > are in xAPIC mode, the BSP changed (which is currently not handled correctly > by KVM) and the BSP has x2APIC enabled. All the core APICs are > software-enabled. The expected behaviour is that the CPUs with x2APIC > enabled would work in x2APIC mode. > > I think such a transitory scenario is possible on real-systems as well, > perhaps during CPU hot-plug. It appears the previous version (before all of > our changes) handled it better. I presume the most efficient way is to start > determining the APIC logical mode from the BSP, and if it is disabled, > traverse the rest of the CPUs until finding the first one with APIC enabled. > Yet, I have not finished doing and checking the BSP fix and other dependent > INIT signal handling fixes. > > In the meanwhile, would you be ok with restoring some of the previous > behaviour - i.e., x2APIC is enabled if any CPU turned it on (regardless to > whether APIC is software enabled), otherwise use the configuration of the > last enabled APIC? > > -- >8 — > Subject: [PATCH] KVM: x86: Traverse all CPUs during recalculate_apic_map > > --- > arch/x86/kvm/lapic.c | 20 +++++++++++--------- > 1 file changed, 11 insertions(+), 9 deletions(-) > > diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c > index 9c90d31..6dc2be6 100644 > --- a/arch/x86/kvm/lapic.c > +++ b/arch/x86/kvm/lapic.c > @@ -139,6 +139,7 @@ static void recalculate_apic_map(struct kvm *kvm) > struct kvm_apic_map *new, *old = NULL; > struct kvm_vcpu *vcpu; > int i; > + bool any_enabled = false; > > new = kzalloc(sizeof(struct kvm_apic_map), GFP_KERNEL); > > @@ -160,13 +161,21 @@ static void recalculate_apic_map(struct kvm *kvm) > if (!kvm_apic_present(vcpu)) > continue; > > + /* > + * All APICs DFRs have to be configured the same mode by an OS. > + * We take advatage of this while building logical id lookup > + * table. After reset APICs are in software disabled mode, so if > + * we find apic with different setting we assume this is the mode > + * OS wants all apics to be in; build lookup table accordingly. > + */ > if (apic_x2apic_mode(apic)) { > new->ldr_bits = 32; > new->cid_shift = 16; > new->cid_mask = (1 << KVM_X2APIC_CID_BITS) - 1; > new->lid_mask = 0xffff; > new->broadcast = X2APIC_BROADCAST; > - } else if (kvm_apic_get_reg(apic, APIC_LDR)) { > + break; > + } else if (!any_enabled && kvm_apic_get_reg(apic, APIC_LDR)) { > if (kvm_apic_get_reg(apic, APIC_DFR) == > APIC_DFR_CLUSTER) { > new->cid_shift = 4; > @@ -179,15 +188,8 @@ static void recalculate_apic_map(struct kvm *kvm) > } > } > > - /* > - * All APICs have to be configured in the same mode by an OS. > - * We take advatage of this while building logical id loockup > - * table. After reset APICs are in software disabled mode, so if > - * we find apic with different setting we assume this is the mode > - * OS wants all apics to be in; build lookup table accordingly. > - */ > if (kvm_apic_sw_enabled(apic)) > - break; > + any_enabled = true; > } > > kvm_for_each_vcpu(i, vcpu, kvm) { > Yes, this looks good. Paolo -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html