On Tue, Feb 16, 2021 at 04:59:52PM +0100, Paolo Bonzini wrote:
> On 16/02/21 15:46, Peter Zijlstra wrote:
> > On Tue, Feb 16, 2021 at 06:27:41AM -0800, Andi Kleen wrote:
> > > I think the IST solution should at least be explored before
> > > dismissing it. It might be simpler than anything else (like
> > > using new APIs)
> >
> > Have you seen the trainwreck bonzini proposed?
>
> You had been suspiciously silent... :-)
> > The very simplest thing is saying no to TDX.
> >
> > That 'solution' also hard relies on #VE not nesting more than once, so
> > lovely things like: #VE -> #DB -> #VE -> #NMI -> #VE, or #VE -> NMI ->
> > #VE -> #MC -> #VE or any number of other possible 'fun' combinations
> > _must_ not happen.
>
> ... but no, this is not how it works. It is actually guaranteed that #VE
> does not nest more than once, and that's the big difference with NMIs.

Note that our NMI entry code is broken vs #MC or any other exception
that can land while we're setting up that recursion mess.

> Let's look at the first case you listed, this is what would happen:
>
>
> #VE handler starts on stack 1
> First #VE processing...
> clear VE-in-progress flag in the info block (allowing reentrancy)
>   #DB handler starts
>     nested #VE handler starts on stack 2

NMI can't land here because of the special ductape? The inner #VE never
clears VE-in-progress.

>       outer #VE handler marks stack 1 for reexecution
>     nested #VE handler ends ***
>   #DB handler ends

So what does the #DB memop that triggered that #VE actually read? What
if it was a store? Because clearly it will not have handled the
on-demand validation thing. So how can memops proceed?

> #VE handler IRETs back to the start of the handler itself
> Second #VE processing starts (also on stack 1)
> clear VE-in-progress flag in the info block
>   #NMI handler
>     nested #VE handler starts on stack 2
>       outer #VE handler marks stack 1 for reexecution
>     nested #VE handler ends ***
>   #NMI handler ends
> #VE handler IRETs back to the start of the handler itself
> Third #VE processing starts (also on stack 1)
> clear VE-in-progress flag in the info block
> #VE handler IRETs back to the caller
>
>
> Two things of note:
>
> - note that at the points marked *** the nested #VE handler has not allowed
>   another exception to come. That only happens in the outer handler.
>
> - the inner handler does nothing but telling the outer handler to rerun.
>   The way it does it is certainly not pretty, because it has to work at any
>   instruction boundary, but at its heart it's basically a do{}while loop.

So this hard relies on inhibiting NMIs and #MC being busted, right?

But I still don't understand what happens to the memops if you don't
handle the #VE.
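
For readers following the quoted trace, the "do{}while loop" shape can be modelled in user space roughly as below. This is only a sketch under stated assumptions, not the proposed entry code: every name here (struct ve_model, in_progress, rerun_requested, nested_ve, outer_ve, the interruptions parameter) is hypothetical, the flag handling is a guess at the described behaviour, and the model ignores the two stacks, IRET, NMI and #MC entirely. In particular it says nothing about what the interrupted memop observes, which is the open question above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model state; none of these names come from the thread. */
struct ve_model {
	bool in_progress;     /* stands in for the VE-in-progress flag */
	bool rerun_requested; /* set by the nested handler for the outer one */
	int passes;           /* times the outer handler body has run */
	int nested_hits;      /* nested #VEs seen so far */
};

/* Nested handler: does no real work, it only asks the outer handler to rerun. */
static void nested_ve(struct ve_model *m)
{
	/* Assumed invariant: a nested #VE is only delivered after the
	 * outer handler has cleared the flag. */
	assert(!m->in_progress);
	m->in_progress = true;     /* set on delivery; never cleared in here */
	m->nested_hits++;
	m->rerun_requested = true; /* models "mark stack 1 for reexecution" */
}

/*
 * Outer handler. 'interruptions' is the number of times some other
 * exception (#DB, NMI, ...) lands after the flag is cleared and itself
 * triggers a #VE.
 */
static void outer_ve(struct ve_model *m, int interruptions)
{
	do {
		m->rerun_requested = false;
		m->passes++;
		m->in_progress = false; /* reentrancy allowed from here on */

		if (interruptions > 0) {
			interruptions--;
			nested_ve(m);
		}
	} while (m->rerun_requested);
}
```

With a zero-initialised model, outer_ve(&m, 2) runs the outer body three times and sees two nested #VEs, which lines up with the first, second and third #VE processing steps in the quoted trace.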