From: Michael Kelley <mikelley@xxxxxxxxxxxxx> Sent: Sunday, June 13, 2021 7:42 PM
>
> From: Mark Rutland <mark.rutland@xxxxxxx> Sent: Thursday, June 10, 2021 9:45 AM
> >
> > Hi Michael,
> >
> > [trimming the bulk of the thread]
> >
> > On Tue, Jun 08, 2021 at 03:36:06PM +0000, Michael Kelley wrote:
> > > I've had a couple rounds of discussions with the Hyper-V team. For
> > > the clocksource we've agreed to table the live migration discussion, and
> > > I'll resubmit the code so that arm_arch_timer.c provides the
> > > standard arch_sys_counter clocksource. As noted previously, this just
> > > works for a Hyper-V guest. The live migration discussion may come
> > > back later after a deeper investigation by Hyper-V.
> >
> > Great; thanks for this!
> >
> > > For clockevents, there's not a near term fix. It's more than just plumbing
> > > an interrupt for Hyper-V to virtualize the ARM64 arch timer in a guest VM.
> > > From their perspective there's also benefit in having a timer abstraction
> > > that's independent of the architecture, and in the Linux guest, the STIMER
> > > code is common across x86/x64 and ARM64. It follows the standard Linux
> > > clockevents model, as it should. The code is already in use in out-of-tree
> > > builds in the Linux VMs included in Windows 10 on ARM64 as part of the
> > > so-called "Windows Subsystem for Linux".
> > >
> > > So I'm hoping we can get this core support for ARM64 guests on Hyper-V
> > > into upstream using the existing STIMER support. At some point, Hyper-V
> > > will do the virtualization of the ARM64 arch timer, but we don't want to
> > > have to stay out-of-tree until after that happens.
> >
> > My main concern here is making sure that we can rely on architected
> > properties, and don't have to special-case architected bits for hyperv
> > (or any other hypervisor), since that inevitably causes longer-term
> > pain.
> >
> > While in abstract I'm not as worried about using the timer
> > clock_event_device specifically, that same driver provides the
> > clocksource and the event stream, and I want those to work as usual,
> > without being tied into the hyperv code. IIUC that will require some
> > work, since the driver won't register if the GTDT is missing timer
> > interrupts (or if there is no GTDT).
> >
> > I think it really depends on what that looks like.
>
> Mark,
>
> Here are the details:
>
> The existing initialization and registration code in arm_arch_timer.c
> works in a Hyper-V guest with no changes. As previously mentioned,
> the GTDT exists and is correctly populated. Even though it isn't used,
> there's a PPI INTID specified for the virtual timer, just so
> the "arm_sys_timer" clockevent can be initialized and registered.
> The IRQ shows up in the output of "cat /proc/interrupts" with zero counts
> for all CPUs since no interrupts are ever generated. The EL1 virtual
> timer registers (CNTV_CVAL_EL0, CNTV_TVAL_EL0, and CNTV_CTL_EL0)
> are accessible in the VM. The "arm_sys_timer" clockevent is left in
> a shutdown state with CNTV_CTL_EL0.ENABLE set to zero when the
> Hyper-V STIMER clockevent is registered with a higher rating.
>
> Event streams are initialized and the __delay() implementation
> for ARM64 inside the kernel works. However, on the Ampere
> eMAG hardware I'm using for testing, the WFE instruction returns
> more quickly than it should even though the event stream fields in
> CNTKCTL_EL1 are correct. I have a query in to the Hyper-V team
> to see if they are trapping WFE and just returning, vs. perhaps the
> eMAG processor takes the easy way out and has WFE just return
> immediately. I'm not knowledgeable about other uses of timer
> event streams, so let me know if there are other usage scenarios
> I should check.

I confirmed that Hyper-V is not trapping the WFE instruction.
And on a Marvell TX2 and on an Ampere Altra, the counter event stream
and WFE provide the expected delay. Evidently WFE on the eMAG doesn't
actually delay. Bottom line: event streams work as expected in a
Hyper-V VM. No changes needed to arm_arch_timer.[ch].

Michael

>
> Finally, the "arch_sys_counter" clocksource gets initialized and
> set up correctly. If the Hyper-V clocksource is also initialized,
> you can flip between the two clocksources at runtime as expected.
> If the Hyper-V clocksource is not set up, then Linux in the VM runs
> fine with the "arch_sys_counter" clocksource.
>
> Michael
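
For anyone wanting to repeat the clocksource check quoted above, here is a
minimal sketch of inspecting and flipping the clocksource at runtime through
the standard sysfs files. It is not taken from the patch series. The helper
names are made up for illustration, and the only clocksource name used is
"arch_sys_counter" from the thread; whatever name the Hyper-V clocksource
registers under should be read from available_clocksource rather than assumed.

```python
from pathlib import Path
from typing import List

# Standard sysfs directory for the system clocksource on Linux.
SYSFS_DIR = Path("/sys/devices/system/clocksource/clocksource0")


def available_clocksources(base: Path = SYSFS_DIR) -> List[str]:
    """Return the names of all clocksources the kernel has registered."""
    return (base / "available_clocksource").read_text().split()


def current_clocksource(base: Path = SYSFS_DIR) -> str:
    """Return the name of the clocksource currently in use."""
    return (base / "current_clocksource").read_text().strip()


def switch_clocksource(name: str, base: Path = SYSFS_DIR) -> None:
    """Ask the kernel to switch to `name` (needs root on a real system)."""
    if name not in available_clocksources(base):
        raise ValueError(f"clocksource {name!r} is not registered")
    (base / "current_clocksource").write_text(name + "\n")


if __name__ == "__main__":
    print("available:", available_clocksources())
    print("current:  ", current_clocksource())
```

Calling switch_clocksource("arch_sys_counter") as root and then re-reading
current_clocksource() should show whether the switch took effect. The `base`
parameter exists only so the helpers can be pointed at a scratch directory
for testing.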