On Mon, Feb 17, 2025, Naveen N Rao wrote: > On Thu, Feb 13, 2025 at 04:55:13PM -0800, Sean Christopherson wrote: > > On Thu, Feb 13, 2025, Kim Phillips wrote: > > > On 2/11/25 3:46 PM, Sean Christopherson wrote: > > > > On Mon, Feb 10, 2025, Tom Lendacky wrote: > > > > > On 2/7/25 17:34, Kim Phillips wrote: > > > > Third, letting userspace opt-in to something doesn't necessarily mean giving > > userspace full control. Which is the entire reason I asked the question about > > whether or not this can break userspace. E.g. we can likely get away with only > > making select features opt-in, and enforcing everything else by default. > > > > I don't think RESTRICTED_INJECTION or ALTERNATE_INJECTION can work without KVM > > cooperation, so enforcing those shouldn't break anything. > > > > It's still not clear to me that we don't have a bug with DEBUG_SWAP. AIUI, > > DEBUG_SWAP is allowed by default. I.e. if ALLOWED_FEATURES is unsupported, then > > the guest can use DEBUG_SWAP via SVM_VMGEXIT_AP_CREATE without KVM's knowledge. > > In sev_es_prepare_switch_to_guest(), we save host debug register state > (DR0-DR3) only if KVM is aware of DEBUG_SWAP being enabled in the guest > (via vmsa_features). So, from what I can tell, it looks like the guest > will end up overwriting host state if it enables DEBUG_SWAP without > KVM's knowledge? > > Not sure if that's reason enough to enforce ALLOWED_SEV_FEATURES for > DEBUG_SWAP :) > > If ALLOWED_SEV_FEATURES is not supported, we may still have to > unconditionally save the host DR0-DR3 registers. Aha! We do not. The existing KVM code is actually flawed in the opposite direction, in that saving host DR0..DR3 during sev_es_prepare_switch_to_guest() is superfluous. DR7 is reset on #VMEXIT (swap type "C"), and so KVM only needs to ensure DR0..DR3 are restored before loading DR7 with the host's value. KVM takes care of that in common x86 code via hw_breakpoint_restore(). However, there is still a bug, as the AMD-specific *masks* are not restored. KVM doesn't support MSR_F16H_DRx_ADDR_MASK emulation or passthrough for normal guests, so the guest can't set those values either, i.e. will get a #VC. But the masks do need to be restored, because the CPU will clobber them with '0' when DebugSwap is enabled. And of course the DR0..DR3 loads in sev_es_prepare_switch_to_guest() are a complete waste of cycles. *sigh* Ugh, it gets worse. The awfulness goes in both direction. Unless I've misunderstood how this all works, just because *KVM* enables DebugSwap for the BSP doesn't mean the guest will enable DebugSwap for APs. And so KVM _can't_ rely on DebugSwap to actually swap DR0..DR3, because KVM has no way of knowing whether or not a given vCPU actually has it enabled. And that's true even when SEV_ALLOWED_FEATURES comes along, which means treating DR0..DR3 as swap type B is utterly worthless. What a mess. I'll send a small series to try and clean things up. Also, I told y'all so: https://lore.kernel.org/all/YWnbfCet84Vup6q9@xxxxxxxxxx P.S. Can someone please get the APM updated to actually explain WTF enabling Debug Virtualization in SEV_FEATURES does? The APM does not at all match what y'all have told me. A sane reading of the APM would be that DRs are *unconditionally* swap type B when DebugSwap is supported, whereas I've been told from the very beginning[*] that treating them as type B requires software enabling: AMD Milan introduces support for the swapping, as type 'B', of DR[0-3] and DR[0-3]_ADDR_MASK registers. It requires that SEV_FEATURES[5] be set in the VMSA. [*] https://lore.kernel.org/all/20221201021948.9259-3-aik@xxxxxxx