On Sun, Apr 7, 2013 at 12:41 AM, Michael S. Tsirkin <mst@xxxxxxxxxx> wrote:
> On Thu, Apr 04, 2013 at 04:32:01PM -0700, Christoffer Dall wrote:
>> [...]
>>
>> >> >> to give us some idea how much performance we would gain from
>> >> >> each approach? Throughput should be completely unaffected
>> >> >> anyway, since virtio just coalesces kicks internally.
>> >> >
>> >> > Latency is dominated by the scheduling latency.
>> >> > This means virtio-net is not the best benchmark.
>> >> >>
>> >> >> So what is a good benchmark?
>> >
>> > E.g. ping pong stress will do but need to look at CPU utilization,
>> > that's what is affected, not latency.
>> >
>> >> Is there any difference in speed at all? I strongly doubt it.
>> >> One of virtio's main points is to reduce the number of kicks.
>> >
>> > For this stage of the project I think microbenchmarks are more appropriate.
>> > Doubling the price of exit is likely to be measurable. 30 cycles likely
>> > not ...
>> >
>> I don't quite understand this point here. If we don't have anything
>> real-world where we can measure a decent difference, then why are we
>> doing this? I would agree with Alex that the three test scenarios
>> proposed by him should be tried out before adding this complexity,
>> measured in CPU utilization or latency as you wish.
>
> Sure, plan to do real world benchmarks for PV MMIO versus PIO as well.
> I don't see why I should bother implementing hypercalls given that the
> kvm maintainer says they won't be merged.
>

The implementation effort to simply measure the hypercall performance
should be minimal, no? If we can measure a true difference in
performance, I'm sure we can revisit the issue of what will be merged
and what won't be, but until we have those numbers it's all
speculation.

>> FWIW, ARM always uses MMIO and provides hardware decoding of all sane
>> (not user register-writeback) instructions, but the hypercall vs. mmio
>> looks like this:
>>
>> hvc: 4,917
>> mmio_kernel: 6,248
>
> So 20% difference? That's not far from what happens on my intel laptop:
> vmcall 1519
> outl_to_kernel 1745
> 10% difference here.
>
>>
>> But I doubt that an hvc wrapper around mmio decoding would take care
>> of all this difference, because the mmio operation needs to do other
>> work not related to emulating the instruction in software, which
>> you'd have to do for an hvc anyway (populate kvm_mmio structure etc.)
>>
>
> Instead of speculating, someone with relevant hardware
> could just try this, but kvm unittest doesn't seem to have arm support
> at the moment. Anyone working on this?
>

We have a branch called kvm-selftest that replicates much of the
functionality, which is what I run to get these measurements. I can
port it over to unittest at some point, but I'm not actively working on
that.

I can measure it, but we have bigger fish to fry on the ARM side right
now, so it'll be a while until I get to that.
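
The "20%" and "10%" figures quoted above are rounded, and the result depends on which measurement is taken as the baseline. A minimal sketch of the arithmetic in Python follows. It assumes the four numbers are comparable per-exit costs (the thread does not state the unit), and the names `MEASUREMENTS`, `overhead` and `report` are hypothetical helpers introduced only for this illustration.

```python
# Figures copied from the thread; the unit is not stated there.
MEASUREMENTS = {
    "arm": ("hvc", 4917, "mmio_kernel", 6248),
    "x86": ("vmcall", 1519, "outl_to_kernel", 1745),
}


def overhead(fast, slow):
    """Return the gap as a fraction of the fast path and of the slow path."""
    delta = slow - fast
    return delta / fast, delta / slow


def report():
    """Build one summary line per architecture."""
    lines = []
    for arch, (fast_name, fast, slow_name, slow) in MEASUREMENTS.items():
        extra, saved = overhead(fast, slow)
        lines.append(
            f"{arch}: {slow_name} costs {extra:.1%} more than {fast_name}; "
            f"{fast_name} saves {saved:.1%} of {slow_name}"
        )
    return lines


if __name__ == "__main__":
    for line in report():
        print(line)
```

Measured against the slower path, the gap comes out close to the rounded figures in the thread; measured against the hypercall, it is somewhat larger. Neither view says how much of the gap a hypercall wrapper around MMIO decoding would actually recover, which is the open question in the message above.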