Re: AMD SEV-SNP/Intel TDX: validation of memory pages

Peter Zijlstra <peterz@xxxxxxxxxxxxx> · Tue, 16 Feb 2021 17:47:30 +0100

On Tue, Feb 16, 2021 at 04:59:52PM +0100, Paolo Bonzini wrote:
> On 16/02/21 15:46, Peter Zijlstra wrote:
> > On Tue, Feb 16, 2021 at 06:27:41AM -0800, Andi Kleen wrote:
> > > I think the IST solution should at least be explored before
> > > dismissing it. It might be simpler than anything else (like
> > > using new APIs)
> > 
> > Have you seen the trainwreck Bonzini proposed?
> 
> You had been suspiciously silent...

:-)

> > The very simplest thing is saying no to TDX.
> > 
> > That 'solution' also hard relies on #VE not nesting more than once, so
> > lovely things like: #VE -> #DB -> #VE -> #NMI -> #VE, or #VE -> NMI ->
> > #VE -> #MC -> #VE or any number of other possible 'fun' combinations
> > _must_ not happen.
> 
> ... but no, this is not how it works.  It is actually guaranteed that #VE
> does not nest more than once, and that's the big difference with NMIs.

Note that our NMI entry code is broken vs #MC or any other exception
that can land while we're setting up that recursion mess.

> Let's look at the first case you listed, this is what would happen:
> 
> 
> #VE handler starts on stack 1
> First #VE processing...
> clear VE-in-progress flag in the info block (allowing reentrancy)
> 	#DB handler starts
> 		nested #VE handler starts on stack 2

NMI can't land here because of the special duct tape? The inner #VE never
clears VE-in-progress.

> 		outer #VE handler marks stack 1 for reexecution
> 		nested #VE handler ends ***
> 	#DB handler ends

So what does the #DB memop that triggered that #VE actually read? What
if it was a store?

Because clearly it will not have handled the on-demand validation thing.
So how can memops proceed?

> #VE handler IRETs back to the start of the handler itself
> Second #VE processing starts (also on stack 1)
> clear VE-in-progress flag in the info block
> 	#NMI handler
> 		nested #VE handler starts on stack 2
> 		outer #VE handler marks stack 1 for reexecution
> 		nested #VE handler ends ***
> 	#NMI handler ends
> #VE handler IRETs back to the start of the handler itself
> Third #VE processing starts (also on stack 1)
> clear VE-in-progress flag in the info block
> #VE handler IRETs back to the caller
> 
> 
> Two things of note:
> 
> - note that at the points marked *** the nested #VE handler has not allowed
> another exception to come.  That only happens in the outer handler.
> 
> - the inner handler does nothing but tell the outer handler to rerun.
> The way it does it is certainly not pretty, because it has to work at any
> instruction boundary, but at its heart it's basically a do{}while loop.

So this hard relies on inhibiting NMIs and #MC being busted, right? But
I still don't understand what happens to the memops if you don't handle
the #VE.
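
For reference, the control flow in the quoted trace can be modeled as a
small user-space sketch. This is only a toy model of the "do{}while"
shape described above, not kernel entry code and not the TDX interface:
the class, method, and field names (VEModel, outer_ve, nested_ve, rerun,
unresolved_nested) are all made up for illustration. It assumes, as the
trace does, that a nested #VE only asks the outer handler to rerun. How
the access that raised the nested #VE completes is not stated in the
trace, so the model merely counts those accesses instead of resolving
them.

```python
# Toy model of the re-execution scheme from the quoted trace.
# All names are hypothetical; nothing here is a real kernel or TDX API.

class VEModel:
    def __init__(self):
        self.outer_active = False    # outer #VE handler running (stack 1)
        self.rerun = False           # "mark stack 1 for reexecution"
        self.passes = 0              # how many times the outer body ran
        self.unresolved_nested = 0   # nested faults the model never handles
        self.log = []

    def nested_ve(self):
        # Inner handler (stack 2): does no fault handling at all, it only
        # tells the outer handler to rerun. The trace does not say what the
        # faulting access observes, so it is just counted here.
        assert self.outer_active
        self.rerun = True
        self.unresolved_nested += 1
        self.log.append("nested #VE: mark rerun")

    def interrupting_handler(self, name):
        # A #DB or NMI handler that takes a #VE while the outer one runs.
        self.log.append(f"{name} handler starts")
        self.nested_ve()
        self.log.append(f"{name} handler ends")

    def outer_ve(self, interruptions):
        # interruptions: handlers that hit during successive passes, in order.
        pending = list(interruptions)
        self.outer_active = True
        while True:                  # the do{}while loop
            self.rerun = False
            self.passes += 1
            self.log.append(f"#VE processing, pass {self.passes}")
            if pending:
                self.interrupting_handler(pending.pop(0))
            if not self.rerun:
                break
        self.outer_active = False
        self.log.append("IRET back to the caller")
        return self.passes
```

Calling VEModel().outer_ve(["#DB", "NMI"]) runs the outer body three
times, matching the first, second and third #VE processing steps in the
trace, and leaves unresolved_nested at 2. That counter is exactly the
part the question above is about.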