Re: Should SEV-ES #VC use IST? (Re: [PATCH] Allow RDTSC and RDTSCP from userspace)

Andy Lutomirski <luto@xxxxxxxxxx> · Tue, 23 Jun 2020 11:26:52 -0700

On Tue, Jun 23, 2020 at 8:23 AM Andrew Cooper <andrew.cooper3@xxxxxxxxxx> wrote:
>
> On 23/06/2020 14:03, Peter Zijlstra wrote:
> > On Tue, Jun 23, 2020 at 02:12:37PM +0200, Joerg Roedel wrote:
> >> On Tue, Jun 23, 2020 at 01:50:14PM +0200, Peter Zijlstra wrote:
> >>> If SNP is the sole reason #VC needs to be IST, then I'd strongly urge
> >>> you to only make it IST if/when you try and make SNP happen, not before.
> >> It is not the only reason, when ES guests gain debug register support
> >> then #VC also needs to be IST, because #DB can be promoted into #VC
> >> then, and as #DB is IST for a reason, #VC needs to be too.
> > Didn't I read somewhere that that is only so for Rome/Naples but not for
> > the later chips (Milan) which have #DB pass-through?
>
> I don't know about hardware timelines, but some future part can now opt
> in to having debug registers as part of the encrypted state, and swapped
> by VMExit, which would make debug facilities generally usable, and
> supposedly safe to the #DB infinite loop issues, at which point the
> hypervisor need not intercept #DB for safety reasons.
>
> Its worth nothing that on current parts, the hypervisor can set up debug
> facilities on behalf of the guest (or behind its back) as the DR state
> is unencrypted, but that attempting to intercept #DB will redirect to
> #VC inside the guest and cause fun. (Also spare a thought for 32bit
> kernels which have to cope with userspace singlestepping the SYSENTER
> path with every #DB turning into #VC.)

What do you mean 32-bit?  64-bit kernels have exactly the same
problem.  At least the stack is okay, though.

Anyway, since I'm way behind on this thread, here are some thoughts:

First, I plan to implement actual precise recursion detection for the
IST stacks.  We'll be able to reliably panic when unallowed recursion
happens.

Second, I don't object *that* strongly to switching to a second #VC
stack if an NMI or MCE happens, but we really need to make sure we
cover *all* the bases.  And #VC is distressingly close to "happens at
all kinds of unfortunate times and the guest doesn't actually have
much ability to predice it" right now.  So we have #VC + #DB + #VC,
#VC + NMI + #VC, #VC + MCE + #VC, and even worse options.  So doing
the shift in a reliable way is not necessarily possible in a clean
way.

Let me contemplate.   And maybe produce some code soon.
_______________________________________________
Virtualization mailing list
Virtualization@xxxxxxxxxxxxxxxxxxxxxxxxxx
https://lists.linuxfoundation.org/mailman/listinfo/virtualization