RE: [PATCH v5 22/34] x86/fred: FRED initialization code

"Li, Xin3" <xin3.li@xxxxxxxxx> · Tue, 21 Mar 2023 07:49:37 +0000

> >>> If there is no concrete reason other than overflow for assigning
> >>> NMI and #DB a stack level > 0, then #VE should also be assigned a
> >>> stack level > 0, and #BP too. #VE can happen anytime and anywhere,
> >>> so it is subject to overflow too.
> >> So #BP needs the stack-gap (redzone) for text_poke_bp().
> >>
> >> #BP can end up in kprobes which can then end up in ftrace/perf, depending on
> >> how it's all wired up.
> >>
> >> #VE is currently a trainwreck vs NMI/MCE, but I think FRED solves the worst of
> >> that. I'm not exactly sure how deep the #VE handler goes.
> >>
> > #VE under IDT is *not* using an IST, so we need a solid rationale here.
> 
> #VE, and #VC on AMD, are borderline unusable.  Both under IDT and FRED.

Oops!

> The reason #VE is not IST is because there are plenty of real cases
> where a non-malicious outer hypervisor could create reentrant faults
> that lose program state.  e.g. hitting an IO instruction, then hitting
> an emulated MSR.
>
> There are fewer cases where a non-IST #VE ends up in a re-entrant fault
> (IIRC, you can still manage it by unmapping the entry stack), but you're
> still trusting the outer hypervisor to not e.g. unmap the SYSCALL entry
> point.
> 
> FRED gets rid of the "reentrant fault overwriting it on the stack" case,
> and removes the syscall gap case, replacing them instead with a stack
> overflow in the worst case because there is still no upper bound to how
> many times #VE can actually be delivered in the course of servicing a
> single #VE.

Exactly, FRED stack levels can make use of the whole regular stack space.
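
To put the two points side by side, here is a toy model (not kernel code).
It assumes the FRED rule that an event is delivered at the higher of the
current stack level and the level configured for that event, so a nested
event at the same level keeps pushing onto the current stack. The function
names and the byte counts are made up for illustration.

```c
#include <assert.h>

/* Toy model only, not kernel code; every name here is hypothetical. */

/* Assumed FRED rule: deliver at the higher of the current stack level
 * and the level configured for the event. */
unsigned int fred_new_stack_level(unsigned int cur_sl,
				  unsigned int event_sl)
{
	return cur_sl > event_sl ? cur_sl : event_sl;
}

/* Nested deliveries at an unchanged level keep pushing onto the same
 * stack, so the stack size is the only bound on nesting depth. */
unsigned long max_nested_deliveries(unsigned long stack_bytes,
				    unsigned long frame_bytes)
{
	assert(frame_bytes != 0);
	return stack_bytes / frame_bytes;
}
```

With an illustrative 16 KiB stack and 512 bytes consumed per nested
delivery, max_nested_deliveries(16384, 512) gives 32. A higher stack level
for #VE moves the nesting to a different stack but leaves it bounded only
by that stack's size.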

So I take it you don't support putting #VE on a higher stack level?

> ~Andrew
> 
> P.S. While I hate to cite myself, if you haven't read
> https://docs.google.com/document/d/1hWejnyDkjRRAW-JEsRjA5c9CKLOPc6VKJQsuvODlQEI/edit?usp=sharing
> yet, do so.  It did feed into some of the FRED design.