On Wed, Aug 16, 2017 at 09:03:17PM +0200, Radim Krčmář wrote:
> 2017-08-16 19:19+0200, Paolo Bonzini:
> > On 16/08/2017 18:50, Michael S. Tsirkin wrote:
> >> On Wed, Aug 16, 2017 at 03:30:31PM +0200, Paolo Bonzini wrote:
> >>> While you can filter out instruction fetches, that's not enough. A data
> >>> read could happen because someone pointed the IDT to MMIO area, and who
> >>> knows what the VM-exit instruction length points to in that case.
> >>
> >> Thinking more about it, I don't really see how anything
> >> legal guest might be doing with virtio would trigger anything
> >> but a fault after decoding the instruction. How does
> >> skipping instruction even make sense in the example you give?
> >
> > There's no such thing as a legal guest. Anything that the hypervisor
> > does, that differs from real hardware, is a possible escalation path.
> >
> > This in fact makes me doubt the EMULTYPE_SKIP patch too.
>
> The main hack is that we expect EPT misconfig within a given range to be
> a MMIO NULL write. I think it is fine -- EMULTYPE_SKIP is a common path
> that should have well tested error paths and, IIUC, virtio doesn't allow
> any other access, so it is a problem of the guest if a buggy/malicious
> application can access virtio memory.
>
> >>>>> Plus of course it wouldn't be guaranteed to work on nested.
> >>>>
> >>>> Not sure I got this one.
> >>>
> >>> Not all nested hypervisors are setting the VM-exit instruction length
> >>> field on EPT violations, since it's documented not to be set.
> >>
> >> So that's probably the real issue - nested virt which has to do it
> >> in software at extra cost. We already limit this to intel processors,
>
> Hm, there is no reason to exclude SVM.
>
> >> how about we blacklist nested virt for this optimization?
>
> Not every hypervisor can be easily detected ...

Hypervisors that don't set a hypervisor bit in CPUID are violating the spec
themselves, aren't they?
Anyway, we can add a management option for use in a nested scenario.

> KVM uses standard
> features and SDM clearly says that the instruction length field is
> undefined.

True. Let's see whether intel can commit to a stronger definition.
I don't think there's any rush to make this change.

> We only lose performance if we decode the instruction, but piling
> workarounds creates unexpected corner cases.
>
> I still don't see acceptable alternatives to Paolo's solution.

It's just that this has been there for 3 years and people have built a
product around this. It's not a feature you can discard out of hand out
of theoretical concerns or to improve niche use-cases such as nested
virt.

--
MST