2017-08-18 16:46+0800, Jason Wang:
> On 2017-08-16 22:10, Michael S. Tsirkin wrote:
> > On Wed, Aug 16, 2017 at 03:34:54PM +0200, Paolo Bonzini wrote:
> > > Microsoft pointed out privately to me that KVM's handling of
> > > KVM_FAST_MMIO_BUS is invalid. Using skip_emulation_instruction is invalid
> > > in EPT misconfiguration vmexit handlers, because neither EPT violations
> > > nor misconfigurations are listed in the manual among the VM exits that
> > > set the VM-exit instruction length field.
> > >
> > > While physical processors seem to set the field, this is not architectural
> > > and is just a side effect of the implementation. I couldn't convince
> > > myself of any condition on the exit qualification where VM-exit
> > > instruction length "has" to be defined; there are no trap-like VM-exits
> > > that can be repurposed; and fault-like VM-exits such as descriptor-table
> > > exits provide no decoding information. So I don't really see any way
> > > to keep the full speedup.
> > >
> > > What we can do is use EMULTYPE_SKIP; it only saves 200 clock cycles
> > > because computing the physical RIP and reading the instruction is
> > > expensive, but at least the eventfd is signaled before entering the
> > > emulator. This saves on latency. While at it, don't check breakpoints
> > > when skipping the instruction, as presumably any side effect has been
> > > exposed already.
> > >
> > > Adding a hypercall or MSR write that does a fast MMIO write to a physical
> > > address would do it, but it adds hypervisor knowledge in virtio, including
> > > CPUID handling. So it would be pretty ugly in the guest-side implementation,
> > > but if somebody wants to do it and the virtio side is acceptable to the
> > > virtio maintainers, I am okay with it.
> > >
> > > Cc: Michael S. Tsirkin <mst@xxxxxxxxxx>
> > > Cc: stable@xxxxxxxxxxxxxxx
> > > Fixes: 68c3b4d1676d870f0453c31d5a52e7e65c7448ae
> > > Suggested-by: Radim Krčmář <rkrcmar@xxxxxxxxxx>
> > > Signed-off-by: Paolo Bonzini <pbonzini@xxxxxxxxxx>
> > Jason (cc) who worked on the original optimization said he can
> > work to test the performance impact.
>
> I see regressions on both latency and cpu utilization through netperf TCP_RR
> test:
>
> pkt_size/sessions/+transaction_rate%/+per_cpu_transaction_rate%
>     1/     1/   +0%/   -5%
>     1/    25/   -1%/   -2%
>     1/    50/   -9%/  -10%
>    64/     1/   -3%/   -9%
>    64/    25/    0%/   -2%
>    64/    50/  -10%/  -11%
>   256/     1/  -10%/  -17%
>   256/    25/  -11%/  -12%
>   256/    50/   -9%/  -11%

Might be noticeable ...

I'm ok with the hypervisor detection workaround. Still, we will need a
replacement mechanism for virtio if Intel doesn't change SDM.
And drop this workaround after a solution has been implemented.
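
To make the hypervisor detection idea a bit more concrete, here is a minimal
user-space sketch in C of how such a check could pick the skip path. This is
an assumption about the shape of the workaround, not code from the patch: the
enum, the macro and both function names are hypothetical. The only
architectural fact it relies on is that CPUID.01H:ECX bit 31 is the bit
hypervisors use to advertise their presence. The reasoning is that on bare
metal the VM-exit instruction length can keep being used, since physical
processors seem to set it, while under another hypervisor the instruction is
decoded instead (the EMULTYPE_SKIP path).

```c
#include <stdbool.h>
#include <stdint.h>

/* CPUID.01H:ECX bit 31: set by hypervisors to advertise their presence. */
#define CPUID_1_ECX_HYPERVISOR (UINT32_C(1) << 31)

/* Hypothetical names for this sketch; these are not KVM identifiers. */
enum skip_method {
    SKIP_BY_EXIT_INSN_LEN, /* advance RIP by the VM-exit instruction length */
    SKIP_BY_EMULATOR       /* decode the instruction to learn its length */
};

/* Returns true if the given CPUID.01H:ECX value has the hypervisor bit set. */
static bool running_under_hypervisor(uint32_t cpuid_1_ecx)
{
    return (cpuid_1_ecx & CPUID_1_ECX_HYPERVISOR) != 0;
}

/*
 * Assumed policy: trust the instruction length field only on bare metal,
 * where physical processors are observed to fill it in. When nested under
 * another hypervisor, the field may not be set, so use the slower decode.
 */
static enum skip_method choose_skip_method(uint32_t cpuid_1_ecx)
{
    return running_under_hypervisor(cpuid_1_ecx) ? SKIP_BY_EMULATOR
                                                 : SKIP_BY_EXIT_INSN_LEN;
}
```

A caller would pass in the ECX value returned by CPUID leaf 1, for example
obtained with __get_cpuid() from the compiler's <cpuid.h> on x86. In either
case the eventfd would still be signaled first, so only the cost of skipping
the instruction differs between the two paths.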