Re: [RFC 00/10] KVM: Add TMEM host/guest support

Avi Kivity <avi@xxxxxxxxxx> · Mon, 11 Jun 2012 20:06:47 +0300

On 06/11/2012 06:44 PM, Dan Magenheimer wrote:
> > >> This is pretty steep.  We have flash storage doing a million iops/sec,
> > >> and here you add 19 microseconds to that.
> > >
> > > Might be interesting to test it with flash storage as well...
>
> Well, to be fair, you are comparing a device that costs many
> thousands of $US to a software solution that uses idle CPU
> cycles and no additional RAM.

You don't know that those cycles are idle.  And when in fact you have no
additional RAM, those cycles are wasted to no benefit.

The fact that I/O is being performed doesn't mean that we can waste
cpu.  Those cpu cycles can be utilized by other processes on the same
guest or by other guests.

>  
> > Batching will drastically reduce the number of hypercalls.
>
> For the record, batching CAN be implemented... ramster is essentially
> an implementation of batching where the local system is the "guest"
> and the remote system is the "host".  But with ramster the
> overhead to move the data (whether batched or not) is much MUCH
> worse than a hypercall and ramster still shows performance advantage.

Sure, you can buffer pages in memory but then you add yet another copy. 
I know you think copies are cheap but I disagree.

> So, IMHO, one step at a time.  Get the foundation code in
> place and tune it later if a batching implementation can
> be demonstrated to improve performance sufficiently.

Sorry, no, first demonstrate no performance regressions, then we can
talk about performance improvements.

> > A different
> > alternative is to use ballooning to feed the guest free memory so it
> > doesn't need to hypercall at all.  Deciding how to divide free memory
> > among the guests is hard (but then so is deciding how to divide tmem
> > memory among guests), and adding dedup on top of that is also hard (ksm?
> > zksm?).  IMO letting the guest have the memory and manage it on its own
> > will be much simpler and faster compared to the constant chatting that
> > has to go on if the host manages this memory.
>
> Here we disagree, maybe violently.  All existing solutions that
> try to manage memory across multiple tenants from an "external
> memory manager policy" fail miserably.  Tmem is at least trying
> something new by actively involving both the host and the guest
> in the policy (guest decides which pages, host decides how many)
> and without the massive changes required for something like
> IBM's solution (forgot what it was called).  

cmm2

> Yes, tmem has
> overhead but since the overhead only occurs where pages
> would otherwise have to be read/written from disk, the
> overhead is well "hidden".

The overhead is NOT hidden.  We put a lot of effort into tuning virtio-blk
to reduce its overhead, and now you add 6-20 microseconds per page.  A
guest may easily be reading a quarter million pages per second, and this
adds up very fast - at the upper end you're consuming 5 vcpus just for tmem.
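
As a rough check of that arithmetic, here is a minimal sketch. The page rate and the 6-20 microsecond range are the figures quoted above; the helper name is made up for illustration, and the model assumes one tmem call per page with nothing overlapped or batched.

```python
# Back-of-the-envelope: CPU time burned by a fixed per-page overhead.
# vcpus_consumed() is a hypothetical helper, not part of any kernel or KVM API.

def vcpus_consumed(pages_per_sec: float, overhead_us: float) -> float:
    """CPU-seconds spent per wall-clock second, i.e. vcpus kept fully busy."""
    return pages_per_sec * overhead_us / 1_000_000

PAGES_PER_SEC = 250_000  # "a quarter million pages per second"

for overhead_us in (6, 20):  # lower and upper end of the quoted range
    busy = vcpus_consumed(PAGES_PER_SEC, overhead_us)
    print(f"{overhead_us:>2} us/page -> {busy:.1f} vcpus")
```

Under those assumptions the lower end costs 1.5 vcpus and the upper end 5.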

Note that you don't even have to issue I/O to get a tmem hypercall
invoked.  Allocate a ton of memory and you get cleancache calls for
each page that passes through the tail of the LRU.  Again taking the
upper end, allocating a gigabyte can now take a few seconds extra.
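
The same kind of estimate for the allocation case, as a sketch only. It assumes 4 KiB pages and exactly one cleancache call per page of the allocation; both are simplifying assumptions, and the helper name is hypothetical.

```python
# Back-of-the-envelope: extra time added to a large allocation when every
# page pushed off the LRU tail costs one tmem call.
# Assumptions (not from the thread): 4 KiB pages, one call per page.

PAGE_SIZE = 4096  # bytes; assumed page size

def extra_seconds(bytes_allocated: int, overhead_us: float,
                  page_size: int = PAGE_SIZE) -> float:
    """Total added latency in seconds for the given allocation size."""
    pages = bytes_allocated // page_size
    return pages * overhead_us / 1_000_000

GIB = 2 ** 30

for overhead_us in (6, 20):  # the 6-20 microsecond range quoted above
    print(f"{overhead_us:>2} us/page -> {extra_seconds(GIB, overhead_us):.2f} s per GiB")
```

With these assumptions a gigabyte is 262144 pages, so the added time ranges from about 1.6 to about 5.2 seconds.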

-- 
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.
