On 21/03/2023 12:12 am, Li, Xin3 wrote:
>>> If there is no other concrete reason other than overflowing for
>>> assigning NMI and #DB with a stack level > 0, #VE should also be
>>> assigned with a stack level > 0, and #BP too. #VE can happen anytime
>>> and anywhere, so it is subject to overflowing too.
>> So #BP needs the stack-gap (redzone) for text_poke_bp().
>>
>> #BP can end up in kprobes which can then end up in ftrace/perf, depending on
>> how it's all wired up.
>>
>> #VE is currently a trainwreck vs NMI/MCE, but I think FRED solves the worst of
>> that. I'm not exactly sure how deep the #VE handler goes.
>>
> VE under IDT is *not* using an IST, we need some solid rationales here.

#VE, and #VC on AMD, are borderline unusable. Both under IDT and FRED.

The reason #VE is not IST is because there are plenty of real cases
where a non-malicious outer hypervisor could create reentrant faults
that lose program state. e.g. hitting an IO instruction, then hitting
an emulated MSR.

There are fewer cases where a non-IST #VE ends up in a re-entrant fault
(IIRC, you can still manage it by unmapping the entry stack), but you're
still trusting the outer hypervisor to not e.g. unmap the SYSCALL entry
point.

FRED gets rid of the "reentrant fault overwriting it on the stack" case,
and removes the syscall gap case, replacing them instead with a stack
overflow in the worst case because there is still no upper bound to how
many times #VE can actually be delivered in the course of servicing a
single #VE.

~Andrew

P.S. While I hate to cite myself, if you haven't read
https://docs.google.com/document/d/1hWejnyDkjRRAW-JEsRjA5c9CKLOPc6VKJQsuvODlQEI/edit?usp=sharing
yet, do so. It did feed into some of the FRED design.