On Thu, Apr 28, 2011 at 08:00:57PM -0700, Zachary Amsden wrote: > On 04/28/2011 01:20 PM, Joerg Roedel wrote: >> This code checks how many guest tsc cycles have passed since this vCPU >> was de-scheduled last time (and before it is running again). So since >> the vCPU hasn't run in the meantime it had no chance to change its TSC. >> Further, the other parameters like the TSC offset and the scaling >> multiplier havn't changed too, so the only variable in the guest-tsc >> calculation is the host-tsc. >> So this calculation using the guest-tsc can detect backwards going >> host-tsc as good as the old one. The benefit here is that we can feed >> consistent values into adjust_tsc_offset(). >> > > While true, this is more complex than the original code. The original > code here doesn't try to actually compensate for the guest TSC > difference - instead what it does is NULL any discovered host TSC delta: Why should KVM care about the host-tsc going backwards when the guest-tsc moves forward? I think the bottom-line of this code is to make sure the guest-tsc does not jump around (or even jumps backwards). I also don't agree that this is additional complexity. With these changes we were able to get rid of the last_host_tsc variable which is actually a simplification. As I see it, there are two situations here: 1) On hosts without tsc-scaling the value of tsc_delta calculated from the guest-tsc values is the same as when calculated with host-tsc values, because the guest-tsc only differs by an offset from the host-tsc. 2) On hosts with tsc-scaling these two tsc_delta values may be different. If the guest has a different tsc-freq as the host, we can't simply adjust the tsc by an offset calculated from the host-tsc values. So tsc_delta has to be calculated using the guest-tsc. > > if (tsc_delta < 0) > mark_tsc_unstable("KVM discovered backwards TSC"); > if (check_tsc_unstable()) { > kvm_x86_ops->adjust_tsc_offset(vcpu, -tsc_delta); > vcpu->arch.tsc_catchup = 1; > } > kvm_make_request(KVM_REQ_CLOCK_UPDATE, vcpu); > > Erasing that delta also erases elapsed time since the VCPU has last been > run, which isn't desirable, so it then sets tsc_catchup mode, which will > restore the proper TSC. The request here triggers code which later > updates the TSC offset again. I just looked again at the tsc_catchup thing. Since the clock-update is forced in the code above, and the clock-update-code adjusts the tsc itself if necessary, is it really necessary to do this here at all? > I left compute_guest_tsc in place to recompute time in guest units here, > even if the underlying hardware rate changes. > > /* > * We may have to catch up the TSC to match elapsed wall clock > * time for two reasons, even if kvmclock is used. > * 1) CPU could have been running below the maximum TSC rate > * 2) Broken TSC compensation resets the base at each VCPU > * entry to avoid unknown leaps of TSC even when running > * again on the same CPU. This may cause apparent elapsed > * time to disappear, and the guest to stand still or run > * very slowly. > */ > if (vcpu->tsc_catchup) { > u64 tsc = compute_guest_tsc(v, kernel_ns); > if (tsc > tsc_timestamp) { > kvm_x86_ops->adjust_tsc_offset(v, tsc - > tsc_timestamp); > tsc_timestamp = tsc; > } > } > > So yeah, the code is getting pretty complex but we'd like to avoid that > as much as possible - so I would prefer to have the hardware backwards > compensation separate from the guest rate computation by doing this: > > step 1) remove any backwards hardware TSC delta (in hardware units) > step 2) recompute guest TSC from a stable clock (gotten from kernel_ns) > and apply adjustment (in guest units) > > So it appears you can just share most of the logic of guest TSC catchup > mode. > > What do you think? With the compensation done in kvm_guest_time_update I see no reason anymore to adjust the tsc in vcpu_load at all. So if we can remove the adjustment from there we can switch back to host-tsc there. The question still is whether this needs to be done in KVM. For older kernels Jan will take patches that handle this in kvm-kmod. So this code can probably be removed from vcpu_load. The benefit is that we don't need to re-introduce the last_host_tsc. If we still need to call adjust_tsc_offset() in vcpu_load then we have to do the calculation with the guest_tsc because this gives us a tsc_delta in units that adjust_tsc_offset expects. Regards, Joerg -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html