> > But if KVM just ignores any hardware exception in such a case, the CPU will
> > re-generate it once it resumes guest execution, which looks cleaner.
>
> That's not strictly guaranteed, especially if KVM injected the exception in the
> first place.  It's definitely broken if KVM is running L2 and L1 injected an
> exception, in which case the exception (from L1) doesn't necessarily have
> anything at all to do with the code being executed by L2.

I hadn't thought about this case, but it's real, and KVM must inject the
exception.

> Ditto for exceptions synthesized and/or migrated from userspace.
>
> And as Paolo called out, it doesn't work for traps.

Yes. To be a bit clearer: AFAIK, the only such case is a #DB trap, whose
event type is 3, i.e., hardware exception.

> There are also likely edge cases around Accessed bits and whatnot.

Do you think that, for a hardware *fault* caused by the current instruction,
KVM doesn't need to inject it?

I'm thinking of adding comments about what is crystal clear and what is
still vague, to help whoever is interested understand the scary details.

> > The question is, must KVM inject a hardware exception from the IDT vectoring
> > information field? Is there any correctness issue if KVM does not?
>
> Yes.  I'm guessing if we start walking through the myriad flows and edge cases,
> we'll find more.

I do want to see such cases listed in the comments whenever we notice one.

> > If no correctness issue, it's better to not do it,
>
> In a vacuum, if we were developing a hypervisor from scratch, maybe.  It's most
> definitely not better when we're talking about undoing ~15 years of behavior

I also noticed that the original code is from 2008, the early days of KVM.
It's definitely safer to keep the existing behavior.

> (and bugs and fixes) in one of the most gnarly areas in x86 virtualization.  E.g.
> see
>
>   https://lore.kernel.org/all/20220830231614.3580124-1-seanjc@xxxxxxxxxx
>
> for all the work it took to get KVM to correctly handle L1 exception intercept,
> and the messy history of the many hacks that came before.

Injecting events from IDT vectoring is a fundamental job, and a lot of
proven, reliably working code is based on it. I think this is your point.

> In short, I am not willing to even consider such a change without an absolutely
> insane amount of tests and documentation proving correctness, _and_ very
> strong evidence that such a change would actually benefit anyone.

This is more of a direction check. Based on the case you mentioned, I think
I have the answer already.

Thanks!
    Xin