Re: [RFC 0/4] KVM in-kernel PM Timer implementation

Peter Lieven <pl@xxxxxxx> · Tue, 21 Feb 2012 19:10:58 +0100

On 15.12.2010 12:53, Ulrich Obergfell wrote:
----- "Anthony Liguori" <anthony@xxxxxxxxxxxxx> wrote:

On 12/14/2010 06:09 AM, Ulrich Obergfell wrote:
[...]

Parts 1 thru 4 of this RFC contain experimental source code which
I recently used to investigate the performance benefit. In a Linux
guest, I was running a program that calls gettimeofday() 'n' times
in a loop (the PM Timer register is read during each call). With
in-kernel PM Timer, I observed a significant reduction of program
execution time.

I've played with this in the past.  Can you post real numbers,
preferably with a real workload?

Anthony,

I only experimented with a gettimeofday() loop. In this test scenario,
the in-kernel PM Timer reduced the program execution time to roughly
half of what it takes with the userspace PM Timer. Please find some
example results below (they were obtained while the host was not busy).
The relative difference between in-kernel and userspace PM Timer is
high, whereas the absolute difference per call appears to be low. So
the benefit depends heavily on how frequently gettimeofday() is called
in a real workload. I don't have any numbers from a real workload. When
I began working on this, I was motivated by the fact that the Linux
kernel itself provides an optimization for the gettimeofday() call
('vxtime'). From this I presumed that there would be real workloads
that benefit from optimizing gettimeofday() (otherwise, why would we
have 'vxtime'?). Of course, 'vxtime' is not related to PM-Timer-based
timekeeping. However, the experimental code shows an approach to
optimizing gettimeofday() in KVM virtual machines.

Regards,

Uli

- host:

# grep "model name" /proc/cpuinfo | sort | uniq -c
       8 model name : Intel(R) Core(TM) i7 CPU       Q 740  @ 1.73GHz

# uname -r
2.6.37-rc4

- guest:

# grep "model name" /proc/cpuinfo | sort | uniq -c
       4 model name : QEMU Virtual CPU version 0.13.50

- test program ('gtod.c'):

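#include <stdio.h>
#include <stdlib.h>
#include <sys/time.h>

struct timeval tv;

int main(int argc, char *argv[])
{
	int i;

	if (argc < 2) {
		fprintf(stderr, "usage: %s <count>\n", argv[0]);
		return 1;
	}
	i = atoi(argv[1]);	/* number of gettimeofday() calls */
	while (i-- > 0)
		gettimeofday(&tv, NULL);
	return 0;
}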

- example results with in-kernel PM Timer:

# for i in 1 2 3
do
time ./gtod 25000000
done
real	0m44.302s
user	0m1.090s
sys	0m43.163s

real	0m44.509s
user	0m1.100s
sys	0m43.393s

real	0m45.290s
user	0m1.160s
sys	0m44.123s

# for i in 10000000 50000000 100000000
do
time ./gtod $i
done
real	0m17.981s
user	0m0.810s
sys	0m17.157s

real	1m27.253s
user	0m1.930s
sys	1m25.307s

real	2m51.801s
user	0m3.359s
sys	2m48.384s

- example results with userspace PM Timer:

# for i in 1 2 3
do
time ./gtod 25000000
done
real	1m24.185s
user	0m2.000s
sys	1m22.168s

real	1m23.508s
user	0m1.750s
sys	1m21.738s

real	1m24.437s
user	0m1.900s
sys	1m22.517s

# for i in 10000000 50000000 100000000
do
time ./gtod $i
done
real	0m33.479s
user	0m0.680s
sys	0m32.785s

real	2m50.831s
user	0m3.389s
sys	2m47.405s

real	5m42.304s
user	0m7.319s
sys	5m34.919s
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

I am currently analyzing a performance regression together with Gleb
where a Windows 7 / Win2008R2 VM hammers the PM timer approx. 15000
times/s during I/O. Performance is therefore very bad and the CPU is
at 100%.

Has anyone done any further work on the in-kernel PM Timer or a full
implementation?

Would it be possible to rebase this old experimental patch to see if it
helps with the performance regression we came across?

Thank you,
Peter