On Mon, 2012-06-11 at 11:09 +0300, Avi Kivity wrote:
> On 06/08/2012 04:20 PM, Sasha Levin wrote:
> > I re-ran benchmarks in a single user environment to get more stable
> > results, increasing the test files to 50gb each.
> >
> > First, a test of the good case scenario for KVM TMEM - we'll try
> > streaming a file which compresses well but is bigger than the host RAM
> > size:
> >
> > First, no KVM TMEM, caching=none:
> >
> > sh-4.2# time dd if=test/zero of=/dev/null bs=4M count=2048
> > 2048+0 records in
> > 2048+0 records out
> > 8589934592 bytes (8.6 GB) copied, 116.309 s, 73.9 MB/s
> >
> > real    1m56.349s
> > user    0m0.015s
> > sys     0m15.671s
> > sh-4.2# time dd if=test/zero of=/dev/null bs=4M count=2048
> > 2048+0 records in
> > 2048+0 records out
> > 8589934592 bytes (8.6 GB) copied, 116.191 s, 73.9 MB/s
> >
> > real    1m56.255s
> > user    0m0.018s
> > sys     0m15.504s
> >
> > Now, no KVM TMEM, caching=writeback:
> >
> > sh-4.2# time dd if=test/zero of=/dev/null bs=4M count=2048
> > 2048+0 records in
> > 2048+0 records out
> > 8589934592 bytes (8.6 GB) copied, 122.894 s, 69.9 MB/s
> >
> > real    2m2.965s
> > user    0m0.015s
> > sys     0m11.025s
> > sh-4.2# time dd if=test/zero of=/dev/null bs=4M count=2048
> > 2048+0 records in
> > 2048+0 records out
> > 8589934592 bytes (8.6 GB) copied, 110.915 s, 77.4 MB/s
> >
> > real    1m50.968s
> > user    0m0.011s
> > sys     0m10.108s
>
> Strange that system time is lower with cache=writeback.

Maybe because these pages don't get written out immediately? I don't have
a better guess.

> > And finally, KVM TMEM on, caching=none:
> >
> > sh-4.2# time dd if=test/zero of=/dev/null bs=4M count=2048
> > 2048+0 records in
> > 2048+0 records out
> > 8589934592 bytes (8.6 GB) copied, 119.024 s, 72.2 MB/s
> >
> > real    1m59.123s
> > user    0m0.020s
> > sys     0m29.336s
> >
> > sh-4.2# time dd if=test/zero of=/dev/null bs=4M count=2048
> > 2048+0 records in
> > 2048+0 records out
> > 8589934592 bytes (8.6 GB) copied, 36.8798 s, 233 MB/s
> >
> > real    0m36.950s
> > user    0m0.005s
> > sys     0m35.308s
>
> So system time more than doubled compared to non-tmem cache=none. The
> overhead per page is 17s / (8589934592/4096) = 8.1usec. Seems quite high.

Right, but consider that it didn't increase real time at all.

> 'perf top' while this is running would be interesting.

I'll update later with this.

> > This is a snapshot of kvm_stats while the 2nd run in the KVM TMEM test
> > was going:
> >
> > kvm statistics
> >
> >  kvm_exit                         1952342   36037
> >  kvm_entry                        1952334   36034
> >  kvm_hypercall                    1710568   33948
>
> In that test, 56k pages/sec were transferred. Why are we seeing only
> 33k hypercalls/sec? Shouldn't we have two hypercalls/page (one when
> evicting a page to make some room, one to read the new page from tmem)?

The guest doesn't do eviction at all - in fact, it doesn't know how big
the cache is, so even if it wanted to, it couldn't evict pages (the only
thing it does is invalidate pages which have changed in the guest). This
means it takes only one hypercall/page instead of two.
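
(For reference, this is roughly the arithmetic behind the per-page and
per-second figures quoted above - just a back-of-the-envelope restatement
of numbers already in this thread, not new measurements:)

# 17s of extra sys time spread over the 4 KiB pages of the 8.6 GB file
$ echo "scale=1; 17 * 10^6 / (8589934592 / 4096)" | bc
8.1
# page rate implied by 233 MB/s, next to the ~34k hypercalls/sec above
$ echo "233 * 10^6 / 4096" | bc
56884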
> > Now, for the worst case "streaming test". I've tried streaming two
> > files, one which has good compression (zeros), and one full with random
> > bits. Doing two runs for each.
> >
> > First, the baseline - no KVM TMEM, caching=none:
> >
> > Zero file:
> > 12800+0 records in
> > 12800+0 records out
> > 53687091200 bytes (54 GB) copied, 703.502 s, 76.3 MB/s
> >
> > real    11m43.583s
> > user    0m0.106s
> > sys     1m42.075s
> > 12800+0 records in
> > 12800+0 records out
> > 53687091200 bytes (54 GB) copied, 691.208 s, 77.7 MB/s
> >
> > real    11m31.284s
> > user    0m0.100s
> > sys     1m41.235s
> >
> > Random file:
> > 12594+1 records in
> > 12594+1 records out
> > 52824875008 bytes (53 GB) copied, 655.778 s, 80.6 MB/s
> >
> > real    10m55.847s
> > user    0m0.107s
> > sys     1m39.852s
> > 12594+1 records in
> > 12594+1 records out
> > 52824875008 bytes (53 GB) copied, 652.668 s, 80.9 MB/s
> >
> > real    10m52.739s
> > user    0m0.120s
> > sys     1m39.712s
> >
> > Now, this is with zcache enabled in the guest (not going through KVM
> > TMEM), caching=none:
> >
> > Zeros:
> > 12800+0 records in
> > 12800+0 records out
> > 53687091200 bytes (54 GB) copied, 704.479 s, 76.2 MB/s
> >
> > real    11m44.536s
> > user    0m0.088s
> > sys     2m0.639s
> > 12800+0 records in
> > 12800+0 records out
> > 53687091200 bytes (54 GB) copied, 690.501 s, 77.8 MB/s
> >
> > real    11m30.561s
> > user    0m0.088s
> > sys     1m57.637s
>
> zcache appears not to be helping at all; it's just adding overhead. Is
> even the compressed file too large?
>
> overhead = 1.4 usec/page.

Correct - I had to further increase the size of this file so that zcache
would fail here as well. The good case was tested before; here I wanted to
see what happens with files that don't benefit much from either regular
caching or zcache.

> > Random:
> > 12594+1 records in
> > 12594+1 records out
> > 52824875008 bytes (53 GB) copied, 656.436 s, 80.5 MB/s
> >
> > real    10m56.480s
> > user    0m0.034s
> > sys     3m18.750s
> > 12594+1 records in
> > 12594+1 records out
> > 52824875008 bytes (53 GB) copied, 658.446 s, 80.2 MB/s
> >
> > real    10m58.499s
> > user    0m0.046s
> > sys     3m23.678s
>
> Overhead grows to 7.6 usec/page.
>
> > Next, with KVM TMEM enabled, caching=none:
> >
> > Zeros:
> > 12800+0 records in
> > 12800+0 records out
> > 53687091200 bytes (54 GB) copied, 711.754 s, 75.4 MB/s
> >
> > real    11m51.916s
> > user    0m0.081s
> > sys     2m59.952s
> > 12800+0 records in
> > 12800+0 records out
> > 53687091200 bytes (54 GB) copied, 690.958 s, 77.7 MB/s
> >
> > real    11m31.102s
> > user    0m0.082s
> > sys     3m6.500s
>
> Overhead = 6.6 usec/page.
>
> > Random:
> > 12594+1 records in
> > 12594+1 records out
> > 52824875008 bytes (53 GB) copied, 656.378 s, 80.5 MB/s
> >
> > real    10m56.445s
> > user    0m0.062s
> > sys     5m53.236s
> > 12594+1 records in
> > 12594+1 records out
> > 52824875008 bytes (53 GB) copied, 653.353 s, 80.9 MB/s
> >
> > real    10m53.404s
> > user    0m0.066s
> > sys     5m57.087s
>
> Overhead = 19 usec/page.
>
> This is pretty steep. We have flash storage doing a million iops/sec,
> and here you add 19 microseconds to that.

Might be interesting to test it with flash storage as well...

> > This is a snapshot of kvm_stats while this test was running:
> >
> > kvm statistics
> >
> >  kvm_entry                         168179   20729
> >  kvm_exit                          168179   20728
> >  kvm_hypercall                     131808   16409
>
> The last test was running 19k pages/sec, doesn't quite fit with this
> measurement. Is the measurement stable or fluctuating?

It's pretty stable when running the "zero" pages, but when switching to
the random file it fluctuates somewhat.
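
(The same sort of rough arithmetic for the streaming runs - the sizes and
sys times come from the runs above, and the baseline I subtract is
approximate:)

# ~80 MB/s of 4 KiB pages, next to the ~16.4k hypercalls/sec above
$ echo "80 * 10^6 / 4096" | bc
19531
# random-data run with KVM TMEM: sys time went from ~100s to ~355s,
# spread over the pages of the 53 GB file - close to the 19 usec/page
# figure quoted above
$ echo "scale=1; (355 - 100) * 10^6 / (52824875008 / 4096)" | bc
19.7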
> > And finally, KVM TMEM enabled, with caching=writeback:
>
> I'm not sure what the point of this is? You have two host-caching
> mechanisms running in parallel, are you trying to increase overhead
> while reducing effective cache size?

I thought that you had asked for this test:

On Wed, 2012-06-06 at 16:24 +0300, Avi Kivity wrote:
> while cache=writeback with cleancache enabled in the host should
> give the same effect, but with the extra hypercalls, but with an extra
> copy to manage the host pagecache. It would be good to see results for
> all three settings.

> My conclusion is that the overhead is quite high, but please double
> check my numbers, maybe I missed something obvious.

I'm not sure what options I have to lower the overhead here - should I be
using something other than hypercalls to communicate with the host? I know
that there are several things being worked on from the zcache perspective
(WasActive, batching, etc.), but is there something that could be done
within the scope of kvm-tmem?

It would also be interesting to see results for Xen/TMEM and to compare
them with these numbers.