On 02/17/2010 08:07 PM, Alexander Graf wrote:
On 17.02.2010, at 17:34, Avi Kivity wrote:
On 02/17/2010 06:23 PM, Alexander Graf wrote:
On 17.02.2010, at 17:03, Avi Kivity wrote:
On 02/17/2010 04:56 PM, Alexander Graf wrote:
So I changed the code according to your input by making all FPU calls explicit and getting rid of all binary patching.
On the PowerStation again I'm running this code (simplified to the important instructions) using kvmctl:
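    li   r2, 0x1234      # load an arbitrary bit pattern into r2
    std  r2, 0(r1)       # store it as a doubleword at the address in r1
    lfd  f3, 0(r1)       # load that doubleword into FPR f3
    lfd  f4, 0(r1)       # and into FPR f4
do_mul:
    fmul f0, f3, f4      # f0 = f3 * f4 (the emulated instruction)
    b    do_mul          # loop forever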
With the following kvm_stat output:
dec              2236       53
exits        60797802  1171403
ext_intr          379        4
halt_wakeup         0        0
inst_emu     60795247  1171344
ld           60795132  1171348
So I'm getting 1171403 fmul operations per second. And that's even with non-optimized instruction fetching. Not bad.
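To make the reading of those counters explicit, here is a minimal sketch that parses the sample above and checks how many of the exits are instruction emulations. It assumes the second column is a running total and the third is the per-second rate (the message only confirms the rate); `parse_kvm_stat` is a hypothetical helper, not part of kvm_stat.

```python
# kvm_stat sample copied from the message.
# Assumed columns: event name, running total, events per second.
SAMPLE = """\
dec 2236 53
exits 60797802 1171403
ext_intr 379 4
halt_wakeup 0 0
inst_emu 60795247 1171344
ld 60795132 1171348
"""


def parse_kvm_stat(text):
    """Hypothetical helper: map event name -> (total, per_second)."""
    stats = {}
    for line in text.splitlines():
        name, total, per_sec = line.split()
        stats[name] = (int(total), int(per_sec))
    return stats


stats = parse_kvm_stat(SAMPLE)
exits_per_sec = stats["exits"][1]
emu_per_sec = stats["inst_emu"][1]

# Fraction of guest exits per second that are emulated instructions.
emu_share = emu_per_sec / exits_per_sec
print(f"inst_emu share of exits: {emu_share:.4%}")
```

Nearly every exit is an emulated instruction, which is why the exit rate can be read as the fmul rate for this loop.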
It's a large number, but won't real hardware be three orders of magnitude faster?
Yes, it would. But we don't have to care. The only thing we need to worry about is being fast enough to emulate the FPU instructions actually used in normal guests, so that the guest runs at full speed. And 1000k > 250k, so apparently we can do that, leaving some spare cycles for non-FPU instructions.
I'm sure 250k isn't representative of a floating point intensive program (but maybe there aren't fpu intensive applications on that cpu).
Now you made me check how fast the real hw is. I get about 65,000,000 fmul operations per second on it.
That's surprisingly low.
So we're 65x slower on a PowerStation. And that's for a tight FPU only loop. I'm still not convinced we're running into major problems.
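For reference, the arithmetic behind these figures can be sketched as follows. The constants are the numbers quoted in this thread; treating 250k as FPU instructions per second in a normal guest is an assumption about the earlier message. With the measured rate rather than the rounded 1000k, the slowdown comes out at about 55x.

```python
# Figures quoted in the thread (fmul operations per second).
NATIVE_FMUL_PER_SEC = 65_000_000    # measured on the real hardware
EMULATED_FMUL_PER_SEC = 1_171_403   # kvm_stat exit rate for the fmul loop
# Assumption: 250k is the FPU instruction rate of a "normal" guest.
GUEST_FPU_PER_SEC = 250_000

# How much slower emulation is than native execution for this loop.
slowdown = NATIVE_FMUL_PER_SEC / EMULATED_FMUL_PER_SEC

# How many times over the emulated rate covers the assumed guest demand.
headroom = EMULATED_FMUL_PER_SEC / GUEST_FPU_PER_SEC

print(f"slowdown vs. native: {slowdown:.1f}x")
print(f"headroom over guest demand: {headroom:.1f}x")
```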
Well, it's up to you. I just hope we don't end up underperforming due
to this.
--
Do not meddle in the internals of kernels, for they are subtle and quick to panic.