Re: [PATCH 02/10] KVM: arm64: Implement initial support for KVM_CAP_SYSTEM_COUNTER_STATE

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Oliver,

Please Cc the KVM/arm64 reviewers (now added). Also, please consider
subscribing to the kvmarm mailing list so that I don't have to
manually approve your posts ;-).

On Tue, 08 Jun 2021 22:47:34 +0100,
Oliver Upton <oupton@xxxxxxxxxx> wrote:
> 
> ARMv8 provides for a virtual counter-timer offset that is added to guest
> views of the virtual counter-timer (CNTVOFF_EL2). To date, KVM has not
> provided userspace with any perception of this, and instead affords a
> value-based scheme of migrating the virtual counter-timer by directly
> reading/writing the guest's CNTVCT_EL0. This is problematic because
> counters continue to elapse while the register is being written, meaning
> it is possible for drift to sneak in to the guest's time scale. This is
> exacerbated by the fact that KVM will calculate an appropriate
> CNTVOFF_EL2 every time the register is written, which will be broadcast
> to all virtual CPUs. The only possible way to avoid causing guest time
> to drift is to restore counter-timers by offset.

Well, the current method has one huge advantage: time can never go
backward from the guest PoV if you restore what you have saved. Yes,
time can elapse, but you don't even need to migrate to observe that.

>
> Implement initial support for KVM_{GET,SET}_SYSTEM_COUNTER_STATE ioctls
> to migrate the value of CNTVOFF_EL2. These ioctls yield precise control
> of the virtual counter-timers to userspace, allowing it to define its
> own heuristics for managing vCPU offsets.

I'm not really in favour of inventing a completely new API, for
multiple reasons:

- CNTVOFF is an EL2 concept. I'd rather not expose it as such as it
  becomes really confusing with NV (which does expose its own CNTVOFF
  via the ONE_REG interface)

- You seem to allow each vcpu to get its own offset. I don't think
  that's right. The architecture defines that all PEs have the same
  view of the counters, and an EL1 guest should be given that
  illusion.

- by having a parallel save/restore interface, you make it harder to
  reason about what happens with concurrent calls to both interfaces

- the userspace API is already horribly bloated, and I'm not overly
  keen on adding more if we can avoid it.

I'd rather you extend the current ONE_REG interface and make it modal,
either allowing the restore of an absolute value or an offset for
CNTVCT_EL0. This would also keep a consistent behaviour when restoring
vcpus. The same logic would apply to the physical offset.

As for how to make it modal, we have plenty of bits left in the
ONE_REG encoding. Pick one, and make that a "relative" attribute. This
will result in some minor surgery in the get/set code paths, but at
least no entirely new mechanism.

One question though: how do you plan to reliably compute the offset?
As far as I can see, it is subject to the same issues you described
above (while the guest is being restored, time flies), and you have
the added risk of exposing a counter going backward from a guest
perspective.

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.



[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux