On Tue, May 16, 2017 at 10:50:00PM +0200, Radim Krčmář wrote: > kvm master clock usually has a different frequency than the kernel boot > clock. This is not a problem until the master clock is updated; > update uses the current kernel boot clock to compute new kvm clock, > which erases any kvm clock cycles that might have built up due to > frequency difference over a long period. > > KVM_SET_CLOCK is one of places where we can safely update master clock > as the guest-visible clock is going to be shifted anyway. > > The problem with current code is that it updates the kvm master clock > after updating the offset. If the master clock was enabled before > calling KVM_SET_CLOCK, then it might have built up a significant delta > from kernel boot clock. > In the worst case, the time set by userspace would be shifted by so much > that it couldn't have been set at any point during KVM_SET_CLOCK. > > To fix this, move kvm_gen_update_masterclock() before computing > kvmclock_offset, which means that the master clock and kernel boot clock > will be sufficiently close together. > Another solution would be to replace get_kvmclock_ns() with > "ktime_get_boot_ns() + ka->kvmclock_offset", which is marginally more > accurate, but would break symmetry with KVM_GET_CLOCK. > > Signed-off-by: Radim Krčmář <rkrcmar@xxxxxxxxxx> > --- > Marcelo, > I found no problem if master clock was not enabled before > kvm_gen_update_masterclock(), but you mentioned that your code depends > on this change -- what have I missed? kvm_gen_update_masterclock enables masterclock, which allows get_kvmclock_ns() to read the proper value. ACK. > Thanks. > --- > I regret not pressing harder when we sanctified this frequency > difference ... too late to make kvm clock follow the boot clock? :) Don't get the point. The frequency difference you mean is: 1. host monotonic clock (interpolated TSC reads from timer interrupt + frequency correction). 2. TSC. kvm clock follow the boot clock? it would follow 1. Well, i haven't thought how to make frequent updates to the kvmclock area of every vCPU while allowing it (kvmclock) to be monotonic across VCPUs. With TSC only, you use the fact that HW guarantees monotonicity across TSCs, so you guarantee monotonicity across kvmclock reads as well which allows for clock_gettime vsyscall in guest userspace. Thats the whole point. Yes, following 1. would get rid of a lot of problems... can you think of a proposal? > --- > arch/x86/kvm/x86.c | 8 +++++++- > 1 file changed, 7 insertions(+), 1 deletion(-) > > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c > index b54125b590e8..b8aad0969690 100644 > --- a/arch/x86/kvm/x86.c > +++ b/arch/x86/kvm/x86.c > @@ -4180,9 +4180,15 @@ long kvm_arch_vm_ioctl(struct file *filp, > goto out; > > r = 0; > + /* > + * TODO: userspace has to take care of races with VCPU_RUN, so > + * kvm_gen_update_masterclock() can be cut down to locked > + * pvclock_update_vm_gtod_copy(). > + */ I really don't get that comment. Care to explain what "cut down" means? > + kvm_gen_update_masterclock(kvm); > now_ns = get_kvmclock_ns(kvm); > kvm->arch.kvmclock_offset += user_ns.clock - now_ns; > - kvm_gen_update_masterclock(kvm); > + kvm_make_all_cpus_request(kvm, KVM_REQ_CLOCK_UPDATE); > break; > } > case KVM_GET_CLOCK: { > -- > 2.13.0