Re: [Qemu-devel] [RFC v4 00/58] Memory API

Avi Kivity <avi@xxxxxxxxxx> · Wed, 20 Jul 2011 17:45:56 +0300

On 07/20/2011 05:31 PM, Anthony Liguori wrote:
The VGA device doesn't know *if* it is mapped. It can be obstructed by
the chipset and by SMM. Other chipsets we emulate may support multiple
VGA cards.

The i440fx can support multiple VGA cards just fine.

Legacy region accesses are always routed by the PCI bus to the first 
PCI device that identifies itself as a graphics card.

The card is well aware of whether it is getting legacy VGA accesses,
because only one card can register for this area.

But the current API doesn't support it.  The card talks to the system 
address space directly.

The new API can support it just fine.  But that requires having 
coalesced mmio in the API.

The e1000 does coalesced I/O for its memory registers. But it's
dubious how much this actually matters anymore. The original claim was
a 10% boost with iperf.

The e1000 is not performance competitive with virtio-net, though, so it
is certainly reasonable to assume that no one would notice if we
removed coalesced I/O from the e1000.

The e1000 NIC is the best we have for guests that don't support virtio.
It's not reasonable to reduce its performance.

So let's talk about real numbers.  This is netperf with a default
invocation from guest to host.  All numbers are in MB/sec.

rtl8139
-------
119.45
118.12

e1000 w/coalesced mmio
----------------------
425.93
424.08

e1000 w/o coalesced mmio
------------------------
419.13
413.83

virtio-net
----------
4330.52
4419.90

So removing coalesced MMIO from the e1000 results in a massive ~2%
slowdown (averaging the two runs above) :-)

And while the e1000 is > 100% faster than the rtl8139, it's still an
order of magnitude slower than the userspace virtio-net.
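
The comparisons can be recomputed from the figures listed above. This is
only a sketch: averaging the two runs per NIC is an assumption, since the
thread does not say how the percentages were derived, and the helper name
slowdown_percent is made up for illustration.

```python
from statistics import mean

# Throughput in MB/sec, copied from the netperf runs listed above.
runs = {
    "rtl8139": [119.45, 118.12],
    "e1000 w/coalesced mmio": [425.93, 424.08],
    "e1000 w/o coalesced mmio": [419.13, 413.83],
    "virtio-net": [4330.52, 4419.90],
}

# Assumption: each configuration is summarized by the mean of its two runs.
avg = {name: mean(values) for name, values in runs.items()}


def slowdown_percent(before, after):
    """Relative throughput loss going from `before` to `after`, in percent."""
    return (before - after) / before * 100.0


coalescing_loss = slowdown_percent(
    avg["e1000 w/coalesced mmio"], avg["e1000 w/o coalesced mmio"]
)
e1000_vs_rtl8139 = avg["e1000 w/coalesced mmio"] / avg["rtl8139"]
virtio_vs_e1000 = avg["virtio-net"] / avg["e1000 w/coalesced mmio"]

print(f"e1000 loss without coalesced mmio: {coalescing_loss:.2f}%")
print(f"e1000 / rtl8139 throughput ratio:  {e1000_vs_rtl8139:.2f}x")
print(f"virtio-net / e1000 throughput ratio: {virtio_vs_e1000:.2f}x")
```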

Fine, we can drop coalesced mmio from e1000.  But not from vga.

I'm confident that the e1000 could be improved if someone modified it 
to optimally use the new netdev interfaces.  But no one cares that 
much about the performance of the e1000.  And if we dropped coalesced 
MMIO support for the e1000, no one would notice.

Exit costs have changed dramatically over the years.  Optimizations
that made sense with P4 class hardware don't necessarily make sense
these days.  QEMU has also changed a lot, so bottlenecks are no longer
where they used to be.
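
For readers who have not looked at what coalescing buys, here is a toy
model of the idea: writes to a coalesced region are queued in a ring and
handed to the device model in batches, instead of one exit per write. It
is not QEMU or KVM code. The class ToyMmioBus, the ring size of 64, and
the exit counting are all assumptions made for illustration; only the
legacy VGA window base 0xA0000 is a real address.

```python
class ToyMmioBus:
    """Toy model that counts exits to the device model (hypothetical)."""

    def __init__(self, ring_size=64, coalesce=True):
        self.ring_size = ring_size  # assumed capacity, not a real KVM value
        self.coalesce = coalesce
        self.ring = []      # pending (addr, value) writes
        self.applied = []   # writes the device model has seen, in order
        self.exits = 0      # simulated exits to the device model

    def write(self, addr, value):
        if not self.coalesce:
            # Without coalescing, every write is its own exit.
            self.exits += 1
            self.applied.append((addr, value))
            return
        self.ring.append((addr, value))
        if len(self.ring) >= self.ring_size:
            self.flush()

    def flush(self):
        """Drain all pending writes in a single exit, preserving order."""
        if self.ring:
            self.exits += 1
            self.applied.extend(self.ring)
            self.ring.clear()


def count_exits(n_writes, coalesce):
    bus = ToyMmioBus(coalesce=coalesce)
    for i in range(n_writes):
        bus.write(0xA0000 + i, i & 0xFF)  # writes into the legacy VGA window
    bus.flush()  # deliver whatever is still queued
    return bus.exits


print("exits without coalescing:", count_exits(1000, coalesce=False))
print("exits with coalescing:   ", count_exits(1000, coalesce=True))
```

How much the saved exits matter depends on the cost of one exit relative
to the rest of the work, which is the point being argued here.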

We either support coalesced mmio well, or not at all. Even if the API
has only one user, that doesn't excuse doing it badly.

It's not at all that black and white.  We need to carefully choose 
what we model and then have the flexibility to break those models in 
the name of performance.

If we try to make everything fit elegantly into a model, we'll end up 
with something that's overly complex just to accommodate a single 
user.  That's my general concern with where we're going here.

I don't think it's too bad and, as I said, I don't object to it in its
current form.  But I think it could be simplified.  Even in its
current non-simple form, it's better than what we currently have.

I'm interested in how it could be simplified.  It's complicated for me 
as well.  But I don't think a side band API is possible.

--
error compiling committee.c: too many arguments to function
