On 06/08/2012 04:20 PM, Sasha Levin wrote:
> I re-ran benchmarks in a single user environment to get more stable results, increasing the test files to 50gb each.
>
> First, a test of the good case scenario for KVM TMEM - we'll try streaming a file which compresses well but is bigger than the host RAM size:
>
> First, no KVM TMEM, caching=none:
>
> sh-4.2# time dd if=test/zero of=/dev/null bs=4M count=2048
> 2048+0 records in
> 2048+0 records out
> 8589934592 bytes (8.6 GB) copied, 116.309 s, 73.9 MB/s
>
> real    1m56.349s
> user    0m0.015s
> sys     0m15.671s
> sh-4.2# time dd if=test/zero of=/dev/null bs=4M count=2048
> 2048+0 records in
> 2048+0 records out
> 8589934592 bytes (8.6 GB) copied, 116.191 s, 73.9 MB/s
>
> real    1m56.255s
> user    0m0.018s
> sys     0m15.504s
>
> Now, no KVM TMEM, caching=writeback:
>
> sh-4.2# time dd if=test/zero of=/dev/null bs=4M count=2048
> 2048+0 records in
> 2048+0 records out
> 8589934592 bytes (8.6 GB) copied, 122.894 s, 69.9 MB/s
>
> real    2m2.965s
> user    0m0.015s
> sys     0m11.025s
> sh-4.2# time dd if=test/zero of=/dev/null bs=4M count=2048
> 2048+0 records in
> 2048+0 records out
> 8589934592 bytes (8.6 GB) copied, 110.915 s, 77.4 MB/s
>
> real    1m50.968s
> user    0m0.011s
> sys     0m10.108s

Strange that system time is lower with cache=writeback.

>
> And finally, KVM TMEM on, caching=none:
>
> sh-4.2# time dd if=test/zero of=/dev/null bs=4M count=2048
> 2048+0 records in
> 2048+0 records out
> 8589934592 bytes (8.6 GB) copied, 119.024 s, 72.2 MB/s
>
> real    1m59.123s
> user    0m0.020s
> sys     0m29.336s
>
> sh-4.2# time dd if=test/zero of=/dev/null bs=4M count=2048
> 2048+0 records in
> 2048+0 records out
> 8589934592 bytes (8.6 GB) copied, 36.8798 s, 233 MB/s
>
> real    0m36.950s
> user    0m0.005s
> sys     0m35.308s

So system time more than doubled compared to non-tmem cache=none. The overhead per page is 17s / (8589934592/4096) = 8.1 usec. Seems quite high. 'perf top' while this is running would be interesting.

>
> This is a snapshot of kvm_stats while the 2nd run in the KVM TMEM test was going:
>
> kvm statistics
>
> kvm_exit           1952342   36037
> kvm_entry          1952334   36034
> kvm_hypercall      1710568   33948

In that test, 56k pages/sec were transferred. Why are we seeing only 33k hypercalls/sec? Shouldn't we have two hypercalls/page (one when evicting a page to make some room, one to read the new page from tmem)?
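Back-of-the-envelope, as a minimal sketch in Python: it assumes the ~17 s of extra system time and the two-hypercalls-per-page model described above, with the other figures taken from the quoted dd and kvm_stat output.

# Rough sanity check of the figures above. The 17 s extra system time and
# the two-hypercalls-per-page expectation are the assumptions stated in the
# comments; file size, throughput and kvm_stat rates are from the quoted output.

PAGE_SIZE = 4096                  # bytes per guest page
FILE_BYTES = 8589934592           # the 8.6 GB test file

pages = FILE_BYTES / PAGE_SIZE    # ~2.1 million pages
extra_sys_s = 17.0                # extra guest system time vs. non-tmem cache=none
print(f"overhead per page: {extra_sys_s / pages * 1e6:.1f} usec")   # ~8.1 usec

# Hypercall-rate cross-check for the cached 233 MB/s run:
pages_per_sec = 233e6 / PAGE_SIZE            # ~57k pages/sec
expected_hypercalls = 2 * pages_per_sec      # one to evict a page, one to read the new one
observed_hypercalls = 33948                  # kvm_hypercall rate from the kvm_stat snapshot
print(f"expected ~{expected_hypercalls:,.0f}/s, observed {observed_hypercalls:,}/s")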
>
>
> Now, for the worst case "streaming test". I've tried streaming two files, one which has good compression (zeros), and one full of random bits. Doing two runs for each.
>
> First, the baseline - no KVM TMEM, caching=none:
>
> Zero file:
> 12800+0 records in
> 12800+0 records out
> 53687091200 bytes (54 GB) copied, 703.502 s, 76.3 MB/s
>
> real    11m43.583s
> user    0m0.106s
> sys     1m42.075s
> 12800+0 records in
> 12800+0 records out
> 53687091200 bytes (54 GB) copied, 691.208 s, 77.7 MB/s
>
> real    11m31.284s
> user    0m0.100s
> sys     1m41.235s
>
> Random file:
> 12594+1 records in
> 12594+1 records out
> 52824875008 bytes (53 GB) copied, 655.778 s, 80.6 MB/s
>
> real    10m55.847s
> user    0m0.107s
> sys     1m39.852s
> 12594+1 records in
> 12594+1 records out
> 52824875008 bytes (53 GB) copied, 652.668 s, 80.9 MB/s
>
> real    10m52.739s
> user    0m0.120s
> sys     1m39.712s
>
> Now, this is with zcache enabled in the guest (not going through KVM TMEM), caching=none:
>
> Zeros:
> 12800+0 records in
> 12800+0 records out
> 53687091200 bytes (54 GB) copied, 704.479 s, 76.2 MB/s
>
> real    11m44.536s
> user    0m0.088s
> sys     2m0.639s
> 12800+0 records in
> 12800+0 records out
> 53687091200 bytes (54 GB) copied, 690.501 s, 77.8 MB/s
>
> real    11m30.561s
> user    0m0.088s
> sys     1m57.637s

zcache appears not to be helping at all; it's just adding overhead. Is even the compressed file too large? Overhead = 1.4 usec/page.

>
> Random:
> 12594+1 records in
> 12594+1 records out
> 52824875008 bytes (53 GB) copied, 656.436 s, 80.5 MB/s
>
> real    10m56.480s
> user    0m0.034s
> sys     3m18.750s
> 12594+1 records in
> 12594+1 records out
> 52824875008 bytes (53 GB) copied, 658.446 s, 80.2 MB/s
>
> real    10m58.499s
> user    0m0.046s
> sys     3m23.678s

Overhead grows to 7.6 usec/page.

>
> Next, with KVM TMEM enabled, caching=none:
>
> Zeros:
> 12800+0 records in
> 12800+0 records out
> 53687091200 bytes (54 GB) copied, 711.754 s, 75.4 MB/s
>
> real    11m51.916s
> user    0m0.081s
> sys     2m59.952s
> 12800+0 records in
> 12800+0 records out
> 53687091200 bytes (54 GB) copied, 690.958 s, 77.7 MB/s
>
> real    11m31.102s
> user    0m0.082s
> sys     3m6.500s

Overhead = 6.6 usec/page.

>
> Random:
> 12594+1 records in
> 12594+1 records out
> 52824875008 bytes (53 GB) copied, 656.378 s, 80.5 MB/s
>
> real    10m56.445s
> user    0m0.062s
> sys     5m53.236s
> 12594+1 records in
> 12594+1 records out
> 52824875008 bytes (53 GB) copied, 653.353 s, 80.9 MB/s
>
> real    10m53.404s
> user    0m0.066s
> sys     5m57.087s

Overhead = 19 usec/page. This is pretty steep. We have flash storage doing a million IOPS, and here you add 19 microseconds to that.

>
>
> This is a snapshot of kvm_stats while this test was running:
>
> kvm statistics
>
> kvm_entry           168179   20729
> kvm_exit            168179   20728
> kvm_hypercall       131808   16409

The last test was running 19k pages/sec, which doesn't quite fit with this measurement. Is the measurement stable or fluctuating?

>
> And finally, KVM TMEM enabled, with caching=writeback:

I'm not sure what the point of this is. You have two host-caching mechanisms running in parallel; are you trying to increase overhead while reducing effective cache size?

My conclusion is that the overhead is quite high, but please double check my numbers, maybe I missed something obvious.

-- 
error compiling committee.c: too many arguments to function
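P.S. For double-checking the streaming-test numbers, a minimal Python sketch of the same arithmetic, assuming overhead per page = (sys time of the test run - sys time of the no-tmem, caching=none baseline) / number of pages, using the first run of each quoted test; file sizes and timings are taken from the dd output above.

# Recomputing the streaming-test per-page overheads from the quoted sys times.
PAGE_SIZE = 4096
ZERO_FILE = 53687091200       # "zeros" file, 54 GB
RANDOM_FILE = 52824875008     # random file, 53 GB

def overhead_usec(file_bytes, baseline_sys_s, test_sys_s):
    """Extra guest system time per page, in microseconds."""
    return (test_sys_s - baseline_sys_s) / (file_bytes / PAGE_SIZE) * 1e6

print(overhead_usec(ZERO_FILE,   102.075, 120.639))   # zcache, zeros    -> ~1.4
print(overhead_usec(RANDOM_FILE,  99.852, 198.750))   # zcache, random   -> ~7.7
print(overhead_usec(ZERO_FILE,   102.075, 179.952))   # KVM TMEM, zeros  -> ~5.9
print(overhead_usec(RANDOM_FILE,  99.852, 353.236))   # KVM TMEM, random -> ~19.6

# Hypercall-rate check for the last kvm_stat snapshot (80.9 MB/s run):
pages_per_sec = 80.9e6 / PAGE_SIZE     # ~19.7k pages/sec
print(2 * pages_per_sec, "expected vs", 16409, "observed hypercalls/sec")

These land in the same ballpark as the 1.4 / 7.6 / 6.6 / 19 usec/page figures above; the exact value shifts a little depending on which run is paired with which baseline.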