On Wed, Apr 05, 2023, Li, Xin3 wrote:
> The VMCS IDT vectoring information field is used to report basic information
> associated with the event that was being delivered when a VM exit occurred.
> such an event itself doesn't trigger a VM exit, however, a condition to deliver
> the event is not met, e.g., EPT violation.
>
> When the IDT vectoring information field reports a maskable external interrupt,
> KVM reinjects this external interrupt after handling the VM exit. Otherwise,
> the external interrupt is lost.
>
> KVM handles a hardware exception reported in the IDT vectoring information
> field in the same way, which makes nothing wrong. This piece of code is in
> __vmx_complete_interrupts():
>
>         case INTR_TYPE_SOFT_EXCEPTION:
>                 vcpu->arch.event_exit_inst_len = vmcs_read32(instr_len_field);
>                 fallthrough;
>         case INTR_TYPE_HARD_EXCEPTION:
>                 if (idt_vectoring_info & VECTORING_INFO_DELIVER_CODE_MASK) {
>                         u32 err = vmcs_read32(error_code_field);
>                         kvm_requeue_exception_e(vcpu, vector, err);
>                 } else
>                         kvm_requeue_exception(vcpu, vector);
>                 break;
>
> But if KVM just ignores any hardware exception in such a case, the CPU will
> re-generate it once it resumes guest execution, which looks cleaner.

That's not strictly guaranteed, especially if KVM injected the exception in the
first place. It's definitely broken if KVM is running L2 and L1 injected an
exception, in which case the exception (from L1) doesn't necessarily have
anything at all to do with the code being executed by L2. Ditto for exceptions
synthesized and/or migrated from userspace. And as Paolo called out, it doesn't
work for traps. There are also likely edge cases around Accessed bits and
whatnot.

> The question is, must KVM inject a hardware exception from the IDT vectoring
> information field? Is there any correctness issue if KVM does not?

Yes. I'm guessing if we start walking through the myriad flows and edge cases,
we'll find more.

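As a rough illustration of the requeue logic quoted above, here is a standalone
userspace sketch, not the kernel code. The bit layout (vector in bits 7:0, type
in bits 10:8, error-code-valid in bit 11, valid in bit 31) follows the Intel SDM,
and the macro names mirror KVM's but are redefined locally. The enum, the struct
and decode_idt_vectoring() are hypothetical names made up for this example, and
event types other than external interrupts and exceptions are deliberately not
modeled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* IDT-vectoring information field layout (redefined locally for the sketch). */
#define VECTORING_INFO_VECTOR_MASK        0x000000ffu
#define VECTORING_INFO_TYPE_MASK          0x00000700u
#define VECTORING_INFO_DELIVER_CODE_MASK  0x00000800u
#define VECTORING_INFO_VALID_MASK         0x80000000u

#define INTR_TYPE_EXT_INTR        (0u << 8)
#define INTR_TYPE_HARD_EXCEPTION  (3u << 8)
#define INTR_TYPE_SOFT_EXCEPTION  (6u << 8)

/* The vector and type sub-fields must not overlap. */
static_assert((VECTORING_INFO_VECTOR_MASK & VECTORING_INFO_TYPE_MASK) == 0,
              "vector and type fields overlap");

/* Hypothetical: what the hypervisor should do with the interrupted event. */
enum requeue_action {
	REQUEUE_NONE,                      /* field not valid, nothing in flight */
	REQUEUE_INTERRUPT,                 /* reinject the external interrupt */
	REQUEUE_EXCEPTION,                 /* requeue exception, no error code */
	REQUEUE_EXCEPTION_WITH_ERROR_CODE, /* requeue exception plus error code */
	REQUEUE_OTHER,                     /* NMI, soft interrupt, etc.: not modeled */
};

struct requeue_decision {
	enum requeue_action action;
	uint8_t vector;
	bool needs_instr_len; /* software exceptions also need the instruction length */
};

/* Hypothetical helper: decode the raw 32-bit field into a decision. */
static struct requeue_decision decode_idt_vectoring(uint32_t info)
{
	struct requeue_decision d = { REQUEUE_NONE, 0, false };

	if (!(info & VECTORING_INFO_VALID_MASK))
		return d;

	d.vector = (uint8_t)(info & VECTORING_INFO_VECTOR_MASK);

	switch (info & VECTORING_INFO_TYPE_MASK) {
	case INTR_TYPE_SOFT_EXCEPTION:
		d.needs_instr_len = true;
		/* fall through */
	case INTR_TYPE_HARD_EXCEPTION:
		d.action = (info & VECTORING_INFO_DELIVER_CODE_MASK) ?
			   REQUEUE_EXCEPTION_WITH_ERROR_CODE : REQUEUE_EXCEPTION;
		break;
	case INTR_TYPE_EXT_INTR:
		d.action = REQUEUE_INTERRUPT;
		break;
	default:
		d.action = REQUEUE_OTHER;
		break;
	}
	return d;
}
```

E.g. a #PF (vector 14) interrupted by an EPT violation would show up as valid +
hardware exception + error-code-valid, and decode to "requeue with error code".
Note that the sketch only decodes the field. It says nothing about where the
event came from (L1, userspace, KVM itself), which is exactly why dropping the
event and hoping the CPU regenerates it isn't safe.
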
> If no correctness issue, it's better to not do it,

In a vacuum, if we were developing a hypervisor from scratch, maybe. It's most
definitely not better when we're talking about undoing ~15 years of behavior
(and bugs and fixes) in one of the most gnarly areas in x86 virtualization.

E.g. see

  https://lore.kernel.org/all/20220830231614.3580124-1-seanjc@xxxxxxxxxx

for all the work it took to get KVM to correctly handle L1 exception intercept,
and the messy history of the many hacks that came before.

In short, I am not willing to even consider such a change without an absolutely
insane amount of tests and documentation proving correctness, _and_ very strong
evidence that such a change would actually benefit anyone.