On Sat, Mar 7, 2020 at 11:01 AM Thomas Gleixner <tglx@xxxxxxxxxxxxx> wrote:
>
> Andy Lutomirski <luto@xxxxxxxxxx> writes:
> > On Sat, Mar 7, 2020 at 7:47 AM Thomas Gleixner <tglx@xxxxxxxxxxxxx> wrote:
> >> The host knows exactly when it injects a async PF and it can store CR2
> >> and reason of that async PF in flight.
> >>
> >> On the next VMEXIT it checks whether apf_reason is 0. If apf_reason is 0
> >> then it knows that the guest has read CR2 and apf_reason. All good
> >> nothing to worry about.
> >>
> >> If not it needs to be careful.
> >>
> >> As long as the apf_reason of the last async #PF is not cleared by the
> >> guest no new async #PF can be injected. That's already correct because
> >> in that case IF==0 which prevents a nested async #PF.
> >>
> >> If MCE, NMI trigger a real pagefault then the #PF injection needs to
> >> clear apf_reason and set the correct CR2. When that #PF returns then the
> >> old CR2 and apf_reason need to be restored.
> >
> > How is the host supposed to know when the #PF returns? Intercepting
> > IRET sounds like a bad idea and, in any case, is not actually a
> > reliable indication that #PF returned.
>
> The host does not care about the IRET. It solely has to check whether
> apf_reason is 0 or not. That way it knows that the guest has read CR2
> and apf_reason.

/me needs actual details

Suppose the host delivers an async #PF. apf_reason != 0 and CR2
contains something meaningful. Host resumes the guest.

The guest does whatever (gets NMI, and does perf stuff, for example).
The guest gets a normal #PF. Somehow the host needs to do:

if (apf_reason != 0) {
    prev_apf_reason = apf_reason;
    prev_cr2 = cr2;
    apf_reason = 0;
    cr2 = actual fault address;
}

resume guest;

Obviously this can only happen if the host intercepts #PF. Let's
pretend for now that this is even possible on SEV-ES (it may well be,
but I would also believe that it's not. SEV-ES intercepts are weird
and I don't have the whole manual in my head.
I'm not sure the host has any way to read CR2 for a SEV-ES guest.)

So now the guest runs some more and finishes handling the inner #PF.
Some time between doing that and running the outer #PF code that reads
apf_reason, the host needs to do:

apf_reason = prev_apf_reason;
cr2 = prev_cr2;
prev_apf_reason = 0;

How is the host supposed to know when to do that?

--Andy