Re: [patch 3/3] PTP: add kvm PTP driver

Marcelo Tosatti <mtosatti@xxxxxxxxxx> · Mon, 16 Jan 2017 15:39:12 -0200

On Mon, Jan 16, 2017 at 06:27:58PM +0100, Radim Krcmar wrote:
> 2017-01-16 15:08-0200, Marcelo Tosatti:
> > On Mon, Jan 16, 2017 at 05:54:11PM +0100, Radim Krcmar wrote:
> >> 2017-01-16 17:26+0100, Radim Krcmar:
> >> > 2017-01-13 15:40-0200, Marcelo Tosatti:
> >> >> On Fri, Jan 13, 2017 at 04:56:58PM +0100, Radim Krcmar wrote:
> >> >> > 2017-01-13 10:01-0200, Marcelo Tosatti:
> >> >>> > +		version = pvclock_read_begin(src);
> >> >>> > +
> >> >>> > +		ret = kvm_hypercall2(KVM_HC_CLOCK_OFFSET,
> >> >>> > +				     clock_off_gpa,
> >> >>> > +				     KVM_CLOCK_OFFSET_WALLCLOCK);
> >> >>> > +		if (ret != 0) {
> >> >>> > +			pr_err("clock offset hypercall ret %lu\n", ret);
> >> >>> > +			spin_unlock(&kvm_ptp_lock);
> >> >>> > +			preempt_enable_notrace();
> >> >>> > +			return -EOPNOTSUPP;
> >> >>> > +		}
> >> >>> > +
> >> >>> > +		tspec.tv_sec = clock_off.sec;
> >> >>> > +		tspec.tv_nsec = clock_off.nsec;
> >> >>> > +
> >> >>> > +		delta = rdtsc_ordered() - clock_off.tsc;
> >> >>> > +
> >> >>> > +		offset = pvclock_scale_delta(delta, src->tsc_to_system_mul,
> >> >>> > +					     src->tsc_shift);
> >> >>> > +
> >> >>> > +	} while (pvclock_read_retry(src, version));
> >> >>> > +
> >> >>> > +	preempt_enable_notrace();
> >> >>> > +
> >> >>> > +	tspec.tv_nsec = tspec.tv_nsec + offset;
> >> >>> > +
> >> >>> > +	spin_unlock(&kvm_ptp_lock);
> >> >>> > +
> >> >>> > +	if (tspec.tv_nsec >= NSEC_PER_SEC) {
> >> >>> > +		u64 secs = tspec.tv_nsec;
> >> >>> > +
> >> >>> > +		tspec.tv_nsec = do_div(secs, NSEC_PER_SEC);
> >> >>> > +		tspec.tv_sec += secs;
> >> >>> > +	}
> >> >>> > +
> >> >>> > +	memcpy(ts, &tspec, sizeof(struct timespec64));
> >> >>> 
> >> >>> But the whole idea of improving the time by reading tsc a bit later
> >> >>> is just weird ... why is it better to provide
> >> >>> 
> >> >>>   tsc + x, time + tsc_delta_to_time(x)
> >> >>> 
> >> >>> than just
> >> >>> 
> >> >>>  tsc, time
> >> >>> 
> >> >>> ?
> >> >> 
> >> >> Because you want to calculate the value of the host realtime clock 
> >> >> at the moment of ptp_kvm_gettime.
> >> >> 
> >> >> We do:
> >> >> 
> >> >> 	1. kvm_hypercall.
> >> >> 	2. get {sec, nsec, guest_tsc}.
> >> >> 	3. kvm_hypercall returns.
> >> >> 	4. delay = rdtsc() - guest_tsc.
> >> >> 
> >> >> Where delay is the delta (measured with the TSC) between points 2 and 4.
> >> > 
> >> > I see now ... the PTP interface is just not good for our purposes.
> >> 
> >> There is getcrosststamp() callback in PTP, which seems to be exactly
> >> what we want when pairing with TSC, so the pvclock delay fixup can be
> >> dropped when using it.
> > 
> > Which pvclock delay fixup do you refer to? The "rdtsc() - clock_offset.tsc"
> > part?
> 
> Yes.
> 
> >       You can't drop it, because if you do then your "host realtime
> > clock read" will be behind by "rdtsc() - clock_offset.tsc" TSC cycles.
> 
> The TSC read will be some cycles old when the hypercall ends, but that
> doesn't matter, because we will pass {sec, nsec, guest_tsc} to PTP and
> PTP should plug them into kernel's realtime clock roughly like this:
> 
>   sec/nsec + (rdtsc() - guest_tsc) * tsc_freq
> 
> Adding delay to guest_tsc and sec/nsec cannot improve precision.
> (And will likely degrade it as kvmclock's frequency is incorrect.)
> 
> > We want the highest precision as possible.
> 
> I agree, which is why we don't want to lose precision in the delay
> guesswork because of gettime64().

Sorry, the clock difference is 10 ns now, so the guest clock is off by
_10 ns_ from the host clock.

Are you suggesting using getcrosststamp() instead, in order to drop the
(rdtsc() - guest_tsc) part?

Please be more verbose.
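
To make sure we are talking about the same arithmetic, here is a rough
user-space sketch of the fixup as I understand it. It is not the driver code:
struct host_sample, scale_delta() and fixup_time() are made-up names for
illustration, scale_delta() only approximates what pvclock_scale_delta() does,
and the 128-bit multiply assumes GCC or Clang.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/* Hypothetical stand-in for the {sec, nsec, guest_tsc} hypercall result. */
struct host_sample {
	uint64_t sec;   /* host realtime clock, seconds */
	uint64_t nsec;  /* host realtime clock, nanoseconds (< NSEC_PER_SEC) */
	uint64_t tsc;   /* guest TSC value paired with sec/nsec */
};

/*
 * Approximation of pvclock-style scaling: shift the TSC delta, then
 * multiply by a 32.32 fixed-point fraction to obtain nanoseconds.
 */
static uint64_t scale_delta(uint64_t delta, uint32_t mul_frac, int shift)
{
	assert(shift > -64 && shift < 64);

	if (shift < 0)
		delta >>= -shift;
	else
		delta <<= shift;

	/* unsigned __int128 is a GCC/Clang extension. */
	return (uint64_t)(((unsigned __int128)delta * mul_frac) >> 32);
}

/*
 * Steps 1-4 above: advance the sampled host time by the delay, measured
 * in TSC cycles, between the host read and the guest's later TSC read.
 */
static struct host_sample fixup_time(struct host_sample s, uint64_t tsc_now,
				     uint32_t mul_frac, int shift)
{
	uint64_t total = s.nsec + scale_delta(tsc_now - s.tsc, mul_frac, shift);

	/* Carry nanosecond overflow into seconds. */
	s.sec += total / NSEC_PER_SEC;
	s.nsec = total % NSEC_PER_SEC;
	s.tsc = tsc_now;
	return s;
}
```

If I read the getcrosststamp() suggestion correctly, the driver would hand
over the raw {sec, nsec, tsc} sample instead, and the same delta arithmetic
would be done later by whoever consumes the pair.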
