On 02/09/2011 12:29 PM, Joerg Roedel wrote:
With TSC scaling in SVM the tsc-offset needs to be
calculated differently. This patch propagates this
calculation into the architecture specific modules so that
this complexity can be handled there.
Signed-off-by: Joerg Roedel<joerg.roedel@xxxxxxx>
---
arch/x86/include/asm/kvm_host.h | 1 +
arch/x86/kvm/svm.c | 11 ++++++++++-
arch/x86/kvm/vmx.c | 6 ++++++
arch/x86/kvm/x86.c | 10 +++++-----
4 files changed, 22 insertions(+), 6 deletions(-)
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 9686950..8c40425 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -593,6 +593,7 @@ struct kvm_x86_ops {
void (*write_tsc_offset)(struct kvm_vcpu *vcpu, u64 offset);
bool (*use_virtual_tsc_khz)(struct kvm_vcpu *vcpu);
+ u64 (*compute_tsc_offset)(struct kvm_vcpu *vcpu, u64 target_tsc);
So I've gone over this series and the only issue I see so far is with
this patch, and it doesn't have to do with upstream code, rather with
code I was about to send.
Logically, the compensation done by adjust_tsc_offset should also be
scaled; currently, this happens only for reasons, both of which are
meant to deal with unstable TSCs; since TSC scaling won't happen on
those processors with unstable TSCs, we don't need to worry about it there.
I have an upcoming patch which does complicate things a bit, which deals
with host suspend. In that case, the host TSC goes backwards and the
offsets needs to be recomputed. However, there is no convenient time to
set the offset again; on VMX, the hardware will not yet be set up and so
can't easily write the offset field in the VMCS. We also can't put a
synchronization barrier on all the VCPUs to write the offset before they
start running without getting into a difficult situation with locking.
So instead, the approach I took was to re-use the adjust_tsc_offset
function and accumulate offsets to apply.
For SVM with TSC scaling, the offset to apply as an adjustment in this
case needs to be scaled. Setting guest TSC (gtsc) equal to the new
guest TSC (gstc'), we have:
gtsc = htsc * mult + offset
gstc' = htsc' * mult + offset'
gtsc' = gtsc
offset' = htsc * mult + offset - htsc' * mult
offset' = (htsc - htsc') * mult + offset
so, delta offset needs to = (htsc - htsc') * mult
We will instead be passing (htsc - htsc') as the adjustment value; the
solution seems simple, we have to scale it up as well in the
adjust_tsc_offset function.
However, the problem is that we need a new architecture specific
function or API change because not all call sites for adjust_tsc want to
have adjustments scaled - the call site dealing with tsc_catchup is
actually working in guest cycles already, so should not be scaled again.
We could have separate functions to adjust TSC cycles by either guest or
host cycle amounts, or pass a flag to adjust_tsc_offset indicating
whether the adjustment is to be applied in guest cycles or host cycles.
The resulting API will be slightly asymmetric, as compute_tsc_offset
lets the generic code compute in terms of hardware offsets, but in the
adjustment case, there isn't an easy way to expose the ability to
compute in hardware offset terms.
One slight pity is that we won't be able to resue
svm_compute_tsc_offset, as the applied delta won't be based off a read
of the tsc. I can't really find a better API though, in case offsets
are computed differently on different hardware (such as multiplying
after the offset), then we need a function to convert guest cycles back
to hardware cycles.
As usual, with the TSC code, it is going to require a lot of commenting
to explain this.
Your code in general looks good.
Cheers,
Zach
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html