> -----Original Message-----
> From: kvmarm-bounces@xxxxxxxxxxxxxxxxxxxxx
> [mailto:kvmarm-bounces@xxxxxxxxxxxxxxxxxxxxx] On Behalf Of Christoffer Dall
> Sent: Thursday, October 11, 2012 8:14 PM
> To: Antonios Motakis
> Cc: kvmarm@xxxxxxxxxxxxxxxxxxxxx
> Subject: Re: VirtIO vs Emulation Netperf benchmark results
>
> On Thu, Oct 11, 2012 at 6:12 AM, Antonios Motakis
> <a.motakis@xxxxxxxxxxxxxxxxxxxxxx> wrote:
> > Sorry for a repost, pressed reply instead of reply to all.
> >
> > On Thu, Oct 11, 2012 at 11:55 AM, Alexander Graf <agraf@xxxxxxx> wrote:
> >>
> >> On 11.10.2012, at 11:46, Marc Zyngier <marc.zyngier@xxxxxxx> wrote:
> >>
> >> > On 10/10/12 19:58, Alexander Graf wrote:
> >> >>
> >> >> On 10.10.2012, at 20:52, Christoffer Dall
> >> >> <c.dall@xxxxxxxxxxxxxxxxxxxxxx> wrote:
> >> >>
> >> >>> On Wed, Oct 10, 2012 at 2:50 PM, Alexander Graf <agraf@xxxxxxx> wrote:
> >> >>>>
> >> >>>> On 10.10.2012, at 20:39, Alexander Spyridakis
> >> >>>> <a.spyridakis@xxxxxxxxxxxxxxxxxxxxxx> wrote:
> >> >>>>
> >> >>>> For your information, with the latest developments related to
> >> >>>> VirtIO, I ran netperf a couple of times to see the exact
> >> >>>> standing of network performance on the guests.
> >> >>>>
> >> >>>> The test was to run netperf -H "ip of LAN node", which tests TCP
> >> >>>> traffic for 10 seconds.
> >> >>>>
> >> >>>> x86 - x86:       ~96 Mbps - reference between two different computers
> >> >>>> ARM Host - x86:  ~80 Mbps
> >> >>>> ARM Guest - x86:  ~2 Mbps - emulation
> >> >>>> ARM Guest - x86: ~74 Mbps - VirtIO
> >> >>>>
> >> >>>> From these we conclude that:
> >> >>>>
> >> >>>> - As expected, x86 to x86 communication can reach the limit of
> >> >>>>   the 100 Mbps LAN.
> >> >>>> - The ARM board does not seem able to saturate the LAN.
> >> >>>> - Network emulation in QEMU is more than just slow (expected).
> >> >>>>
> >> >>>> Why is this expected? This performance drop is quite terrifying.
> >> >>>>
> >> >>> I think he means expected as in, we already know we have this
> >> >>> terrifying problem. I'm looking into this right now, and I
> >> >>> believe Marc is also on this.
> >> >>
> >> >> Ah, good :). Since you are on a dual-core machine with lots of
> >> >> traffic, you should get almost no vmexits for virtio queue processing.
> >> >>
> >> >> Since we know that this is a fast case, the big difference from
> >> >> emulated devices is the exits. So I'd search there :).
> >> >
> >> > There are a number of things we're aware of:
> >> >
> >> > - The emulated device is pure PIO. Using this kind of device is
> >> > always going to suck, and even more on KVM. We could use a "less
> >> > braindead" model (some DMA capable device), but as we depart from
> >> > the real VE board, I'd rather go virtio all the way.
> >>
> >> Well, you should try to get comparable performance numbers. If that
> >> means exposing that braindead device on an x86 vm and turning off
> >> coalesced mmio, so be it.
> >>
> >> The alternative is to expose PCI into the guest, even when it's only
> >> half-working. It's not meant for production, but to get performance
> >> comparison data that you can sanity check against x86 to see if (and
> >> what) you're doing wrong.
> >>
> >> > - Our exit path is painfully long. We could maybe make it more
> >> > efficient by being more lazy, and delay the switch of some of the
> >> > state until we get preempted (VFP state, for example). Not sure how
> >> > much of an improvement this would make, though.
> >>
> >> Lazy FP switch bought me quite a significant speedup on ppc. It won't
> >> help you here though. User space exits need to restore that state
> >> regardless. Unless the guest hasn't used FP. Then you can save
> >> yourself both ways of FP state switches.
> >>
> >
> > VFP switches are already being done lazily, however: only when the
> > guest actually uses some FP or Advanced SIMD instructions, and not on
> > entry. In fact, when we lazy switch the VFP registers, we return
> > directly from Hyp mode interrupt context to the guest, without really
> > giving the host a chance to do much. We do not go all the way back to
> > the ioctl loop.
> >
> > However, on the next vm exit we will switch back to the host state
> > regardless of whether the host is going to use VFP or not, but I don't
> > think optimizing that would offer any big benefits, especially for I/O.
> >
> > Of course things could always be improved; for example, we could try
> > handling the VFP/NEON control registers separately and emulate them,
> > instead of doing a complete switch every time the guest does something
> > simple, e.g. only to check whether VFP is enabled. But we would need
> > some numbers to know whether this would make things better or worse,
> > since this implies another exit.
> >
> > Best regards,
> > Antonios
> >
> >> > - Our memory bandwidth is reduced by the number of TLB entries we
> >> > waste by not using section mappings instead of 4kB pages. Running
> >> > hackbench on a guest shows quite a slowdown that should mostly go
> >> > away if/when userspace switches to huge pages as backing store. I
> >> > expect virtio to suffer from the same problem.
> >>
> >> That one should be in a completely different ballpark. I'd be very
> >> surprised if you get more than 10% slowdown in TLB-miss-intensive
> >> workloads. Definitely not as low-hanging a fruit as the one we see here.
> >>
> >> Alex
> >>
> >> > Once we've addressed these points, I expect the IO performance to
> >> > become better. At least by some margin.
>
> I ran perf a bit yesterday and it seems we spend approx. 5% of the vcpu
> thread's time on vgic save/restore. I don't know if this can be
> optimized at all, though.

[[ss]] Hi Chris, can you tell me how you ran perf to get this level of
detail? When I ran it, I only got a very high-level summary, which is not
very useful.

Thanks,
Senthil

> -Christoffer

_______________________________________________
kvmarm mailing list
kvmarm@xxxxxxxxxxxxxxxxxxxxx
https://lists.cs.columbia.edu/cucslists/listinfo/kvmarm
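
For reference, a per-function breakdown like the ~5% vgic save/restore figure
above can usually be obtained by attaching perf to the VM process and sampling
with call graphs. The sketch below is only a guess at the kind of invocation
involved (the thread does not say how Christoffer ran perf); the
qemu-system-arm process name and the 30-second window are placeholder
assumptions:

    # Sample all threads of the VM process (including the vcpu thread)
    # with call-graph recording for roughly 30 seconds.
    perf record -g -p $(pidof qemu-system-arm) -- sleep 30

    # Per-symbol breakdown; host kernel symbols such as the vgic
    # save/restore code show up here when kernel symbols are available.
    perf report

A live view of the same data is available with "perf top -p <pid>", which is
often enough to spot a hot save/restore path without recording at all.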
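
On Marc's point about huge pages as backing store: the userspace half of that
change amounts to backing guest RAM with hugetlbfs. The sketch below is only
an illustration; it assumes the host kernel has hugetlbfs support on ARM and
that QEMU's generic -mem-path option is wired up for this target, neither of
which the thread confirms, and the page counts are arbitrary:

    # Reserve some huge pages on the host and mount hugetlbfs.
    echo 256 > /proc/sys/vm/nr_hugepages
    mkdir -p /dev/hugepages
    mount -t hugetlbfs hugetlbfs /dev/hugepages

    # Back guest RAM with huge pages instead of 4kB anonymous pages.
    qemu-system-arm ... -m 256 -mem-path /dev/hugepages ...

Whether stage-2 then actually gets section mappings (and the TLB win Marc
describes) still depends on the KVM/ARM side taking advantage of the larger
backing pages.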