Re: KVM timekeeping and TSC virtualization

Zachary Amsden <zamsden@xxxxxxxxxx> · Fri, 20 Aug 2010 13:24:40 -1000

On 08/20/2010 03:26 AM, David S. Ahern wrote:

On 08/20/10 02:07, Zachary Amsden wrote:

This patch set implements full TSC virtualization, with both
trapping and passthrough modes, and intelligent mode switching.
As a result, TSC will never go backwards, and we are stable against
guest re-calibration attempts, VM reset, and migration.  For guests
which require it, the TSC khz can even be preserved on migration
to a new host.

The TSC will never be trapped on UP systems unless the host TSC
actually runs faster than the guest; other conditions, including
bad hardware and changing speeds, are accommodated by using catchup
mode to keep the guest passthrough TSC in line with the host clock.

What's the overhead of trapping TSC reads for Nehalem-type processors?

gettimeofday() in guests is the biggest performance problem with KVM for
me, especially for older OSes like RHEL4 which is a supported OS for
another 2 years. Even with RHEL5, 32-bit, I had to force kvmclock off to
get the VM to run reliably:

http://article.gmane.org/gmane.comp.emulators.kvm.devel/51017/match=kvmclock+rhel5.5

Correctness is the biggest timekeeping problem with KVM for me.  The 
fact that you had to force kvmclock off is evidence of that.  Slightly 
slower applications are fine.  Broken ones are not acceptable.

TSC will not be trapped with kvmclock, and the bug you hit with RHEL5
kvmclock has since been fixed.  As you can see, getting all of these
issues sorted out is neither simple nor straightforward.

Also, TSC will not be trapped with UP VMs, only SMP.  If you seriously 
believe RHEL4 will perform better as an SMP guest than several instances 
of coordinated UP guests, you would worry about this issue.  I don't.  
The amount of upstream scalability and performance work done since that
timeframe is enormous, to the point that it's entirely plausible that
KVM-governed UP RHEL4 guests as a cluster are faster than a RHEL4 SMP host.

So the answer is: it depends.  Hardware is always getting faster, and
trap / exit cost is going down.  Right now, it is anywhere from a few
hundred to multiple thousands of cycles, depending on your hardware.  I
don't have an exact benchmark number I can quote, although in a couple
of hours, I probably will.  I'll guess 3,000 cycles.

I agree, gettimeofday is a huge issue for poorly written applications.
That does not mean we won't speed it up; in fact, I have already done
quite a bit of work on ways to reduce the exit cost.  Let's, however,
get things correct before trying to make them aggressively fast.

Zach