On Fri, 11 Oct 2019 14:54:31 +0100
Mark Rutland <mark.rutland@xxxxxxx> wrote:

> On Fri, Oct 11, 2019 at 02:33:43PM +0100, Marc Zyngier wrote:
> > On Fri, 11 Oct 2019 11:50:11 +0100
> > Mark Rutland <mark.rutland@xxxxxxx> wrote:
> >
> > > Hi,
> > >
> > > On Fri, Oct 11, 2019 at 11:19:00AM +0530, Sai Prakash Ranjan wrote:
> > > > On latest QCOM SoCs like SM8150 and SC7180 with big.LITTLE arch, below
> > > > warnings are observed during bootup of big cpu cores.
> > >
> > > For reference, which CPUs are in those SoCs?
> > >
> > > > SM8150:
> > > >
> > > > [ 0.271177] CPU features: SANITY CHECK: Unexpected variation in
> > > > SYS_ID_AA64PFR0_EL1. Boot CPU: 0x00000011112222, CPU4: 0x00000011111112
> > >
> > > The differing fields are EL3, EL2, and EL1: the boot CPU supports
> > > AArch64 and AArch32 at those exception levels, while the secondary only
> > > supports AArch64.
> > >
> > > Do we handle this variation in KVM?
> >
> > We do, at least at vcpu creation time (see kvm_reset_vcpu). But if one
> > of the !AArch32 CPUs comes in late in the game (after we've started a
> > guest), all bets are off (we'll schedule the 32bit guest on that CPU,
> > enter the guest, immediately take an Illegal Exception Return, and
> > return to userspace with KVM_EXIT_FAIL_ENTRY).
>
> Ouch. We certainly can't remove the warning until we deal with that
> somehow, then.

Indeed. Same thing applies for hot-removing the AArch32-capable CPUs,
by the way. You'd end up in a situation where guests can't run, despite
the initial contract that we're happy with that configuration.

> > Not sure we could do better, given the HW. My preference would be to
> > fail these CPUs if they aren't present at boot time.
>
> I agree; I think we need logic to check the ID register fields against
> their EXACT, {LOWER,HIGHER}_SAFE, etc rules regardless of whether we
> have an associated cap. That can then abort a late onlining of a CPU
> which violates those rules w.r.t. the finalised system value.
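As a rough illustration of the check described above, here is a minimal
userspace sketch. It is not the kernel's cpufeature code: every name in
it (ftr_rule, ftr_field, pfr0_fields, late_cpu_violations, ...) is
hypothetical, the rule assigned to each field is an assumption made for
the example, and only the four exception-level fields of
ID_AA64PFR0_EL1 are covered, treated as unsigned 4-bit values.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names throughout; not the kernel's cpufeature code. */
enum ftr_rule { RULE_EXACT, RULE_LOWER_SAFE, RULE_HIGHER_SAFE };

struct ftr_field {
	const char *name;
	unsigned int shift;	/* bit offset of the 4-bit field */
	enum ftr_rule rule;
};

/* ELx fields of ID_AA64PFR0_EL1; the LOWER_SAFE rule is an assumption. */
const struct ftr_field pfr0_fields[] = {
	{ "EL0", 0,  RULE_LOWER_SAFE },
	{ "EL1", 4,  RULE_LOWER_SAFE },
	{ "EL2", 8,  RULE_LOWER_SAFE },
	{ "EL3", 12, RULE_LOWER_SAFE },
};

/* Extract one unsigned 4-bit field from a 64-bit ID register value. */
unsigned int field_val(uint64_t reg, unsigned int shift)
{
	assert(shift <= 60);
	return (unsigned int)((reg >> shift) & 0xf);
}

/*
 * Can a late CPU with value 'late' join a system whose finalised value
 * for this field is 'sys'? The finalised value is already the safe one,
 * so the late CPU must not force it any further in the safe direction.
 */
int late_field_ok(enum ftr_rule rule, unsigned int sys, unsigned int late)
{
	switch (rule) {
	case RULE_EXACT:
		return late == sys;
	case RULE_LOWER_SAFE:
		return late >= sys;
	case RULE_HIGHER_SAFE:
		return late <= sys;
	}
	return 0;
}

/*
 * Return a bitmask of indexes into pfr0_fields whose rule the late CPU
 * violates; zero means the CPU may be onlined.
 */
unsigned int late_cpu_violations(uint64_t sys_reg, uint64_t late_reg)
{
	unsigned int mask = 0;
	size_t i;

	for (i = 0; i < sizeof(pfr0_fields) / sizeof(pfr0_fields[0]); i++) {
		const struct ftr_field *f = &pfr0_fields[i];

		if (!late_field_ok(f->rule,
				   field_val(sys_reg, f->shift),
				   field_val(late_reg, f->shift)))
			mask |= 1u << i;
	}
	return mask;
}
```

Taking the boot CPU's value from the log as the finalised system value,
late_cpu_violations(0x11112222, 0x11111112) returns 0xe: EL1, EL2 and
EL3 all drop from 2 (AArch64 and AArch32) to 1 (AArch64 only), so such
a CPU would be refused if it came up late. With the arguments swapped
the result is 0, since a more capable late CPU does not break what the
system has already advertised.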
>
> I suspect that we may want to split the notion of
> safe-for-{user,kernel-guest} in the feature tables, as if nothing else
> it will force us to consider those cases separately when adding new
> stuff.

Probably. There are bizarre overlaps, in the sense that some
capabilities (such as this AArch32 EL1 support) are firmly kernel
related, and yet have a direct impact on userspace. KVM blurs the lines
in "interesting" ways... :-(.

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...