On 28/01/16 20:12, Chris Metcalf wrote:
> On 01/27/2016 04:12 AM, Marc Zyngier wrote:
>> On 26/01/16 20:43, Chris Metcalf wrote:
>>> On 01/18/2016 04:28 AM, Marc Zyngier wrote:
>>>> Hi Chris,
>>>>
>>>> On 15/01/16 20:02, Chris Metcalf wrote:
>>>>> We are using GICv2 compatibility mode in the Fast Models/Foundation
>>>>> Models simulations we are running because the boot code (ATF/UEFI)
>>>>> doesn't support GICv3 in our system at the moment.
>>>>>
>>>>> However, starting with kernel 4.2, the guest couldn't boot up because it
>>>>> wasn't getting timer interrupts. I tracked this down to a kernel commit
>>>>> that switched to using the "alternatives" mechanism -- rather than
>>>>> seeing either a GICv2 or GICv3 and configuring appropriately, the KVM
>>>>> code just configured the code that saves/restores the vgic state based
>>>>> on the presence of the system register interface to the GIC CPU
>>>>> interface. See the attached patch for a fix that manages this
>>>>> differently and allows me to boot up the guest in this configuration.
>>>>>
>>>>> However, even assuming this patch can be taken into an upstream tree, I
>>>>> still have a couple of additional problems:
>>>>>
>>>>> - I can boot up with the Foundation Models using this change, but not
>>>>> with the Fast Models (again, using a v3 GIC but in v2 compatibility mode
>>>>> in the device tree). The Fast Models dts looks like it has the same
>>>>> configuration for the GIC and the timers so I'm not sure what's going on
>>>>> here. Any suggestions appreciated.
>>>>>
>>>>> - Without this change, I could only boot kernels up to 4.1. With the
>>>>> change, I can boot kernels up to 4.3. But 4.4 won't boot for me either;
>>>>> I haven't bisected it down yet. So any suggestions on what might be
>>>>> going wrong here would also be appreciated.
>>>>>
>>>>> We are planning to eventually use GICv3 mode in our software stack but
>>>>> for the time being I assume it is interesting to resolve issues with GIC
>>>>> v2 compatibility mode on GIC v3.
>>>>>
>>>> I'm afraid that this is the wrong approach. Whilst 4.2 was a bit too
>>>> eager to use GICv3 (only checking the CPU capability and ignoring the
>>>> actual state of the EL2/EL3 SRE bits), the fact that 4.4 doesn't boot is
>>>> probably the sign of a broken firmware that enables the system register
>>>> interface at EL3, letting the rest of the software stack use GICv3 in
>>>> native mode, and yet providing a GICv2 DT.
>>>>
>>>> This combination is unpredictable, and is likely to cause issues on
>>>> some HW implementations.
>>>>
>>>> Could you please point me to the firmware you're using?
>>>>
>>>> Also, please check the following patches:
>>>>
>>>> 6d32ab2 arm64: Update booting requirements for GICv3 in GICv2 mode
>>>> 76e52dd irqchip/gic: Warn if GICv3 system registers are enabled
>>>> 963fcd4 arm64: cpufeatures: Check ICC_EL1_SRE.SRE before enabling
>>>> ARM64_HAS_SYSREG_GIC_CPUIF
>>>> 7cabd00 irqchip/gic-v3: Make gic_enable_sre an inline function
>>>> d271976 arm64: el2_setup: Make sure ICC_SRE_EL2.SRE sticks before using
>>>> GICv3 sysregs
>>>>
>>>> Can you point me to the one that prevents you from booting?
>>> The problematic commit is 963fcd4, because it calls gic_enable_sre()
>>> in the host kernel even with a GICv2 DT specified, and this seems to
>>> put things in a state such that we don't receive virtual timer
>>> interrupts in the guest when we boot it up. (I'm not that familiar with
>>> the QEMU DT but it is providing a GIC v2 to the guest.)
>>>
>>> With a v4.5-rc1 host, if I "return false" before the code in gic_enable_sre()
>>> that tries to actually enable the SRE, and then hardcode the
>>> __vgic_v2_XXX_state() save/restore calls into the __vgic_XXX_state()
>>> routines, then my guest boots up OK.
>> What if you just do the "return false"? I bet that it will work as well...
>
> Yes, that also works for my case.
>
>>> We are using a modified ARM version of EDK v3.0-rc0, and a modified
>>> ARM Trusted Firmware based on commit 963fcd4 (between v1.1 and 1.2).
>> Are you sure of that commit? It looks suspiciously like the ID from the
>> kernel tree...
>
> Hah, good catch! The double-click-to-copy behavior is kind of flakey
> on RHEL 6's default terminal, and I bet that bit me. It's 41099f4e.
>
>>> We certainly haven't touched any of the GIC code in either one.
>>>
>>> I tried to modify the host DT to enable GICv3, but then the host itself
>>> hangs on boot, so clearly more is needed. (To be fair I've only tested
>>> v4.4 in that configuration, not v4.5-rc1.) The firmware isn't yet using
>>> GICv3 so perhaps that is part of the problem.
>> That's indeed part of the problem. The firmware running at EL3 insists
>> on using GICv2, but still lets EL2 (and EL1) use GICv3 system registers.
>> Could you please dump the content of ICC_SRE_EL3 just before entering
>> the kernel at EL2? If you see ICC_SRE_EL3.SRE being set, then this would
>> indicate a firmware bug (and leave the system in an unpredictable
>> configuration).
>
> Well, the firmware clearly does this intentionally. In ATF's
> drivers/arm/gic/arm_gic.c, the gicv3_cpuif_setup() function has
> a comment that reads:
>
> /*******************************************************************************
>  * This function does some minimal GICv3 configuration. The Firmware itself does
>  * not fully support GICv3 at this time and relies on GICv2 emulation as
>  * provided by GICv3. This function allows software (like Linux) in later stages
>  * to use full GICv3 features.
>  ******************************************************************************/
>
> and the function ends with:
>
>     val = read_icc_sre_el3();
>     write_icc_sre_el3(val | ICC_SRE_EN | ICC_SRE_SRE);
>
> In our build environment, if I comment out those two lines, that
> fixes the guest boot problem (without any hacking on the Linux side),
> so that's good anyway. With this change it works for me in the
> Fast Models as well as Foundation Models, too.

By the look of it, you're trying to use a GICv3 firmware, and pass a
GICv2 DT to the kernel. Do not do that. Either you use a GICv2 firmware
(having spoken to the ATF guys, there is a GICv2 driver in there that
should work for your case) and pass a GICv2 DT, or you go GICv3 all the
way. A mix of the two things is completely unsupported on the model, and
solidly places you in the UNPREDICTABLE category when running that on
actual HW...

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...
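
For reference, the check discussed in this thread (is ICC_SRE_EL3.SRE set while the kernel is handed a GICv2 DT?) can be sketched in C roughly as follows. This is only an illustration under stated assumptions, not code from ATF or the kernel: the helper names gicv2_dt_conflicts() and describe_icc_sre_el3() are hypothetical, and the register value is passed in as a plain integer because actually reading ICC_SRE_EL3 requires running at EL3. The bit positions follow the GICv3 architecture's ICC_SRE_EL3 layout (SRE in bit 0, Enable in bit 3), which is what the ICC_SRE_SRE and ICC_SRE_EN names in the quoted snippet are assumed to refer to.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* ICC_SRE_EL3 fields as laid out in the GICv3 architecture. */
#define ICC_SRE_SRE (UINT64_C(1) << 0) /* system register interface in use */
#define ICC_SRE_DFB (UINT64_C(1) << 1) /* disable FIQ bypass */
#define ICC_SRE_DIB (UINT64_C(1) << 2) /* disable IRQ bypass */
#define ICC_SRE_EN  (UINT64_C(1) << 3) /* lower ELs may access ICC_SRE_EL1/EL2 */

/*
 * Hypothetical helper: returns true when the EL3 setup contradicts a
 * GICv2 device tree, i.e. firmware has enabled the system register
 * interface even though the kernel is told to drive a GICv2.
 */
static bool gicv2_dt_conflicts(uint64_t icc_sre_el3)
{
    return (icc_sre_el3 & ICC_SRE_SRE) != 0;
}

/*
 * Hypothetical helper: formats the four fields of a dumped ICC_SRE_EL3
 * value into buf and returns snprintf()'s result.
 */
static int describe_icc_sre_el3(uint64_t v, char *buf, size_t len)
{
    assert(buf != NULL && len > 0);
    return snprintf(buf, len, "SRE=%d DFB=%d DIB=%d Enable=%d",
                    (v & ICC_SRE_SRE) != 0, (v & ICC_SRE_DFB) != 0,
                    (v & ICC_SRE_DIB) != 0, (v & ICC_SRE_EN) != 0);
}
```

With the value written by the two quoted firmware lines (ICC_SRE_EN | ICC_SRE_SRE), gicv2_dt_conflicts() returns true, which is the mixed GICv3-firmware/GICv2-DT case described above; with SRE clear it returns false.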