[...]
>> >> to give us some idea how much performance we would gain from each
>> >> approach? Throughput should be completely unaffected anyway, since
>> >> virtio just coalesces kicks internally.
>> >
>> > Latency is dominated by the scheduling latency.
>> > This means virtio-net is not the best benchmark.
>>
>> So what is a good benchmark?
>
> E.g. ping pong stress will do but need to look at CPU utilization,
> that's what is affected, not latency.
>
>> Is there any difference in speed at all? I strongly doubt it. One of
>> virtio's main points is to reduce the number of kicks.
>
> For this stage of the project I think microbenchmarks are more appropriate.
> Doubling the price of exit is likely to be measurable. 30 cycles likely
> not ...
>

I don't quite understand this point. If we don't have anything
real-world where we can measure a decent difference, then why are we
doing this? I agree with Alex that the three test scenarios he proposed
should be tried out before adding this complexity, measured in CPU
utilization or latency as you wish.

FWIW, ARM always uses MMIO and provides hardware decoding of all sane
(not user register-writeback) instructions, but hypercall vs. MMIO
looks like this:

hvc:         4,917
mmio_kernel: 6,248

But I doubt that an hvc wrapper around MMIO decoding would take care of
all of this difference, because the MMIO operation needs to do other
work that is not related to emulating the instruction in software, and
which you'd have to do for an hvc anyway (populate the kvm_mmio
structure etc.)

-Christoffer
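To put the two figures above side by side, here is a minimal sketch of the arithmetic. The numbers are the ones quoted in the message. Their unit is not stated there, so the code treats them as opaque cost values. The names `HVC_COST`, `MMIO_KERNEL_COST` and `cost_gap` are illustrative only and do not come from any kernel source.

```python
# Figures quoted in the message; the unit is not stated there,
# so they are treated as opaque per-operation cost values.
HVC_COST = 4917
MMIO_KERNEL_COST = 6248


def cost_gap(baseline, other):
    """Return (absolute gap, gap as a fraction of the baseline)."""
    gap = other - baseline
    return gap, gap / baseline


if __name__ == "__main__":
    gap, rel = cost_gap(HVC_COST, MMIO_KERNEL_COST)
    print(f"mmio_kernel - hvc = {gap}")      # 1331
    print(f"relative to hvc:   {rel:.1%}")   # 27.1%
```

That gap is the most an hvc wrapper could recover. The message argues the real saving would be smaller, because part of the MMIO path's work (populating the kvm_mmio structure etc.) is needed either way.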