On Mon, Dec 10, 2018 at 11:40:47AM +0000, Mark Rutland wrote:
> On Wed, Nov 28, 2018 at 02:45:15PM +0000, Steven Price wrote:
> > This series add support for paravirtualized time for Arm64 guests and
> > KVM hosts following the specification in Arm's document DEN 0057A:
> >
> > https://developer.arm.com/docs/den0057/a
> >
> > It implements support for Live Physical Time (LPT) which provides the
> > guest with a method to derive a stable counter of time during which the
> > guest is executing even when the guest is being migrated between hosts
> > with different physical counter frequencies.
> >
> > It also implements support for stolen time, allowing the guest to
> > identify time when it is forcibly not executing.
>
> I know that stolen time reporting is important, and I think that we
> definitely want to pick up that part of the spec (once it is published
> in some non-draft form).
>
> However, I am very concerned with the pv-freq part of LPT, and I'd like
> to avoid that if at all possible. I say that because:
>
> * By design, it breaks architectural guarantees from the PoV of SW in
>   the guest.
>
>   A VM may host multiple SW agents serially (e.g. when booting, or
>   across kexec), or concurrently (e.g. Linux w/ EFI runtime services),
>   and the host has no way to tell whether all software in the guest will
>   function correctly. Due to this, it's not possible to have a guest
>   opt-in to the architecturally-broken timekeeping.

Is this necessarily true? As I understood the intention of the spec,
there would be no change to the behavior of the timers as exposed by the
hypervisor unless a software agent specifically opts in to LPT and
pv-freq.

In a scenario with Linux and UEFI running, they must clearly agree on
using functionality that changes the underlying behavior.

For kdump/kexec scenarios, the OS would have to tear down the
functionality to work across migration after loading a secondary SW
agent, which probably needs adding to the spec.

>
>   Existing guests will not work correctly once pv-freq is in use, and if
>   configured without pv-freq (or if the guest fails to discover pv-freq
>   for any reason), the administrator may encounter anything between
>   subtle breakage or fatally incorrect timekeeping.
>
>   There's plenty of SW agents other than Linux which runs in a guest,
>   which would need to be updated to handle pv-freq, e.g. GRUB, *BSD,
>   iPXE.
>
>   Given this, I think that this is going to lead to subtle breakage in
>   real-world scenarios.

I think we'd definitely need to limit the exposure of pv-freq to Linux
and (if necessary) UEFI runtime services. Do you see scenarios where
this would not be possible?

[...]

>
> I understand that LPT is supposed to account for time lost during the
> migration. Can we account for this without pv-freq? e.g. is it possible
> to account for this in the same way as stolen time?
>

I think we can indeed account for lost time during migration or host
system suspend by simply adjusting CNTVOFF_EL2 (as Steve points out, KVM
already supports this, but QEMU doesn't make use of that today -- there
were some patches attempting to address that recently).

Thanks,

    Christoffer
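
For reference, the CNTVOFF_EL2 adjustment mentioned above could be
sketched roughly as follows. This is only an illustration of the
arithmetic, not KVM or QEMU code: the helper names virt_count() and
resume_cntvoff() are hypothetical, and it assumes the source and
destination hosts run their counters at the same frequency (the case
where pv-freq is not needed). It relies on the architectural relation
CNTVCT = CNTPCT - CNTVOFF.

```c
#include <stdint.h>

/*
 * Illustrative only: hypothetical helpers, not KVM or QEMU code.
 * Assumes both hosts' counters tick at the same frequency.
 */

/* Virtual count seen by the guest: CNTVCT = CNTPCT - CNTVOFF. */
static uint64_t virt_count(uint64_t cntpct, uint64_t cntvoff)
{
    return cntpct - cntvoff; /* unsigned arithmetic wraps modulo 2^64 */
}

/*
 * Offset to program on the destination so that the guest resumes at
 * saved_cntvct plus lost_ticks. Passing lost_ticks == 0 hides the
 * downtime from the guest; passing the measured downtime (in counter
 * ticks) makes the virtual counter jump forward by that amount.
 */
static uint64_t resume_cntvoff(uint64_t dst_cntpct, uint64_t saved_cntvct,
                               uint64_t lost_ticks)
{
    return dst_cntpct - (saved_cntvct + lost_ticks);
}
```

With a saved virtual count of 1000, 200 ticks of downtime and a
destination physical count of 5000, the offset would be 3800, so the
guest resumes with a virtual count of 1200.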