On 30/09/2023 14:58, David Woodhouse wrote:
From: David Woodhouse <dwmw@xxxxxxxxxxxx>

Most of the time there's no need to kick the vCPU and deliver the timer
event through kvm_xen_inject_timer_irqs(). Use kvm_xen_set_evtchn_fast()
directly from the timer callback, and only fall back to the slow path
when it's necessary to do so.

This gives a significant improvement in timer latency testing (using
nanosleep() for various periods and then measuring the actual time
elapsed).

However, there was a reason¹ the fast path was dropped when this support
was first added. The current code holds vcpu->mutex for all operations on
the kvm->arch.timer_expires field, and the fast path introduces a
potential race condition. Avoid that race by ensuring the hrtimer is
(temporarily) cancelled before making changes in kvm_xen_start_timer(),
and also when reading the values out for KVM_XEN_VCPU_ATTR_TYPE_TIMER.

¹ https://lore.kernel.org/kvm/846caa99-2e42-4443-1070-84e49d2f11d2@xxxxxxxxxx/

Signed-off-by: David Woodhouse <dwmw@xxxxxxxxxxxx>
---
 • v2: Remember, and deal with, those races.

 • v3: Drop the assertions for vcpu being loaded; those can be done
       separately if at all.

       Reorder the code in xen_timer_callback() to make it clearer
       that kvm->arch.xen.timer_expires is being cleared in the case
       where the event channel delivery is *complete*, as opposed to
       the -EWOULDBLOCK deferred path.

       Drop the 'pending' variable in kvm_xen_vcpu_get_attr() and
       restart the hrtimer if (kvm->arch.xen.timer_expires), which
       ought to be exactly the same thing (that's the *point* in
       cancelling the timer, to make it truthful as we return its
       value to userspace).

       Improve comments.

 arch/x86/kvm/xen.c | 49 ++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 49 insertions(+)
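
For anyone following the thread, the control flow described above (try the
fast delivery first, clear timer_expires only when delivery is complete, and
defer to the vCPU on -EWOULDBLOCK) can be sketched as a small userspace
model. This is only an illustration under stated assumptions, not the patch
itself: struct model_vcpu, model_set_evtchn_fast() and model_timer_callback()
are hypothetical names, the fast_path_usable flag is an invented stand-in for
whatever makes the real fast path fail, and the hrtimer, the vcpu->mutex
locking and the cancel-before-modify step are not modelled at all.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-vCPU Xen timer state (not kernel code). */
struct model_vcpu {
    uint64_t timer_expires;  /* non-zero while a timer is outstanding */
    bool timer_pending;      /* set when delivery is deferred to the vCPU */
    bool fast_path_usable;   /* assumption: models whether fast delivery works */
    int events_delivered;    /* events delivered directly by the fast path */
    int kicks;               /* how many times the vCPU had to be kicked */
};

/* Models the fast delivery attempt: 0 on success, -EWOULDBLOCK if it
 * cannot complete in this context. */
int model_set_evtchn_fast(struct model_vcpu *v)
{
    if (!v->fast_path_usable)
        return -EWOULDBLOCK;
    v->events_delivered++;
    return 0;
}

/* Models the timer callback: fast path first, slow path only if needed. */
void model_timer_callback(struct model_vcpu *v)
{
    int rc = model_set_evtchn_fast(v);

    if (rc != -EWOULDBLOCK) {
        /* Delivery is complete, so the timer is no longer outstanding. */
        v->timer_expires = 0;
        return;
    }

    /* Deferred path: leave timer_expires intact, mark the timer pending
     * and kick the vCPU so it can deliver the event itself. */
    v->timer_pending = true;
    v->kicks++;
}
```

In this model timer_expires stays non-zero exactly when delivery has not yet
completed, which is the property the v3 notes rely on when deciding whether
to restart the timer after reading the value out.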
Reviewed-by: Paul Durrant <paul@xxxxxxx>