On Thu, Jan 20, 2022 at 4:52 AM Varad Gautam <varad.gautam@xxxxxxxx> wrote: > > AMD SEV-ES defines a new guest exception that gets triggered on > some vmexits to allow the guest to control what state gets shared > with the host. kvm-unit-tests currently relies on UEFI to provide > this #VC exception handler. > > Switch the tests to install their own #VC handler on early bootup > to process these exits, just after GHCB has been mapped. > > If --amdsev-efi-vc is passed during ./configure, the tests will > continue using the UEFI #VC handler. > > Signed-off-by: Varad Gautam <varad.gautam@xxxxxxxx> > --- > Makefile | 3 +++ > configure | 16 ++++++++++++++++ > lib/x86/amd_sev.c | 3 ++- > lib/x86/amd_sev.h | 1 + > lib/x86/amd_sev_vc.c | 14 ++++++++++++++ > lib/x86/desc.c | 15 +++++++++++++++ > lib/x86/desc.h | 1 + > lib/x86/setup.c | 10 ++++++++++ > x86/Makefile.common | 1 + > 9 files changed, 63 insertions(+), 1 deletion(-) > create mode 100644 lib/x86/amd_sev_vc.c > > diff --git a/Makefile b/Makefile > index 4f4ad23..94a0162 100644 > --- a/Makefile > +++ b/Makefile > @@ -46,6 +46,9 @@ else > $(error Cannot build $(ARCH_NAME) tests as EFI apps) > endif > EFI_CFLAGS := -DTARGET_EFI > +ifeq ($(AMDSEV_EFI_VC),y) > +EFI_CFLAGS += -DAMDSEV_EFI_VC > +endif > # The following CFLAGS and LDFLAGS come from: > # - GNU-EFI/Makefile.defaults > # - GNU-EFI/apps/Makefile > diff --git a/configure b/configure > index 41372ef..c687d9f 100755 > --- a/configure > +++ b/configure > @@ -29,6 +29,7 @@ host_key_document= > page_size= > earlycon= > target_efi= > +amdsev_efi_vc= > > usage() { > cat <<-EOF > @@ -71,6 +72,8 @@ usage() { > Specify a PL011 compatible UART at address ADDR. Supported > register stride is 32 bit only. > --target-efi Boot and run from UEFI > + --amdsev-efi-vc Use UEFI-provided #VC handlers on AMD SEV/ES. Requires > + --target-efi. What's the rationale to make the built-in #VC handler default? Note: I'm not actually saying I have an opinion, one way or the other. But I think the commit description (and cover letter) should explain why we prefer what we're doing. > EOF > exit 1 > } > @@ -138,6 +141,9 @@ while [[ "$1" = -* ]]; do > --target-efi) > target_efi=y > ;; > + --amdsev-efi-vc) > + amdsev_efi_vc=y > + ;; > --help) > usage > ;; > @@ -197,8 +203,17 @@ elif [ "$processor" = "arm" ]; then > processor="cortex-a15" > fi > > +if [ "$amdsev_efi_vc" ] && [ "$arch" != "x86_64" ]; then > + echo "--amdsev-efi-vc requires arch x86_64." > + usage > +fi > + > if [ "$arch" = "i386" ] || [ "$arch" = "x86_64" ]; then > testdir=x86 > + if [ "$amdsev_efi_vc" ] && [ -z "$target_efi" ]; then > + echo "--amdsev-efi-vc requires --target-efi." > + usage > + fi > elif [ "$arch" = "arm" ] || [ "$arch" = "arm64" ]; then > testdir=arm > if [ "$target" = "qemu" ]; then > @@ -356,6 +371,7 @@ WA_DIVIDE=$wa_divide > GENPROTIMG=${GENPROTIMG-genprotimg} > HOST_KEY_DOCUMENT=$host_key_document > TARGET_EFI=$target_efi > +AMDSEV_EFI_VC=$amdsev_efi_vc > EOF > if [ "$arch" = "arm" ] || [ "$arch" = "arm64" ]; then > echo "TARGET=$target" >> config.mak > diff --git a/lib/x86/amd_sev.c b/lib/x86/amd_sev.c > index 6672214..bde126b 100644 > --- a/lib/x86/amd_sev.c > +++ b/lib/x86/amd_sev.c > @@ -14,6 +14,7 @@ > #include "x86/vm.h" > > static unsigned short amd_sev_c_bit_pos; > +phys_addr_t ghcb_addr; > > bool amd_sev_enabled(void) > { > @@ -126,7 +127,7 @@ void setup_ghcb_pte(pgd_t *page_table) > * function searches GHCB's L1 pte, creates corresponding L1 ptes if not > * found, and unsets the c-bit of GHCB's L1 pte. > */ > - phys_addr_t ghcb_addr, ghcb_base_addr; > + phys_addr_t ghcb_base_addr; > pteval_t *pte; > > /* Read the current GHCB page addr */ > diff --git a/lib/x86/amd_sev.h b/lib/x86/amd_sev.h > index 6a10f84..afbacf3 100644 > --- a/lib/x86/amd_sev.h > +++ b/lib/x86/amd_sev.h > @@ -54,6 +54,7 @@ efi_status_t setup_amd_sev(void); > bool amd_sev_es_enabled(void); > efi_status_t setup_amd_sev_es(void); > void setup_ghcb_pte(pgd_t *page_table); > +void handle_sev_es_vc(struct ex_regs *regs); > > unsigned long long get_amd_sev_c_bit_mask(void); > unsigned long long get_amd_sev_addr_upperbound(void); > diff --git a/lib/x86/amd_sev_vc.c b/lib/x86/amd_sev_vc.c > new file mode 100644 > index 0000000..8226121 > --- /dev/null > +++ b/lib/x86/amd_sev_vc.c > @@ -0,0 +1,14 @@ > +/* SPDX-License-Identifier: GPL-2.0 */ > + > +#include "amd_sev.h" > + > +extern phys_addr_t ghcb_addr; > + > +void handle_sev_es_vc(struct ex_regs *regs) > +{ > + struct ghcb *ghcb = (struct ghcb *) ghcb_addr; > + if (!ghcb) { > + /* TODO: kill guest */ > + return; > + } > +} > diff --git a/lib/x86/desc.c b/lib/x86/desc.c > index 16b7256..73aa866 100644 > --- a/lib/x86/desc.c > +++ b/lib/x86/desc.c > @@ -3,6 +3,9 @@ > #include "processor.h" > #include <setjmp.h> > #include "apic-defs.h" > +#ifdef TARGET_EFI > +#include "amd_sev.h" > +#endif > > /* Boot-related data structures */ > > @@ -228,6 +231,9 @@ EX_E(ac, 17); > EX(mc, 18); > EX(xm, 19); > EX_E(cp, 21); > +#ifdef TARGET_EFI > +EX_E(vc, 29); > +#endif > > asm (".pushsection .text \n\t" > "__handle_exception: \n\t" > @@ -293,6 +299,15 @@ void setup_idt(void) > handle_exception(13, check_exception_table); > } > > +void setup_amd_sev_es_vc(void) > +{ > + if (!amd_sev_es_enabled()) > + return; > + > + set_idt_entry(29, &vc_fault, 0); > + handle_exception(29, handle_sev_es_vc); > +} > + > unsigned exception_vector(void) > { > unsigned char vector; > diff --git a/lib/x86/desc.h b/lib/x86/desc.h > index b65539e..4fcbf9f 100644 > --- a/lib/x86/desc.h > +++ b/lib/x86/desc.h > @@ -220,6 +220,7 @@ void set_intr_alt_stack(int e, void *fn); > void print_current_tss_info(void); > handler handle_exception(u8 v, handler fn); > void unhandled_exception(struct ex_regs *regs, bool cpu); > +void setup_amd_sev_es_vc(void); > > bool test_for_exception(unsigned int ex, void (*trigger_func)(void *data), > void *data); > diff --git a/lib/x86/setup.c b/lib/x86/setup.c > index bbd3468..6013602 100644 > --- a/lib/x86/setup.c > +++ b/lib/x86/setup.c > @@ -327,6 +327,16 @@ efi_status_t setup_efi(efi_bootinfo_t *efi_bootinfo) > smp_init(); > setup_page_table(); > > +#ifndef AMDSEV_EFI_VC > + if (amd_sev_es_enabled()) { This is a tautology. `setup_amd_sev_es_vc()` returns immediately upon `!amd_sev_es_enabled()`. Therefore, I think we should remove the check here. > + /* > + * Switch away from the UEFI-installed #VC handler. > + * GHCB has already been mapped at this point. > + */ > + setup_amd_sev_es_vc(); > + } > +#endif /* AMDSEV_EFI_VC */ > + > return EFI_SUCCESS; > } > > diff --git a/x86/Makefile.common b/x86/Makefile.common > index 984444e..65d16e7 100644 > --- a/x86/Makefile.common > +++ b/x86/Makefile.common > @@ -24,6 +24,7 @@ cflatobjs += lib/x86/fault_test.o > cflatobjs += lib/x86/delay.o > ifeq ($(TARGET_EFI),y) > cflatobjs += lib/x86/amd_sev.o > +cflatobjs += lib/x86/amd_sev_vc.o > cflatobjs += lib/efi.o > cflatobjs += x86/efi/reloc_x86_64.o > endif > -- > 2.32.0 > Thank you for adding this patch to the series! I left a comment above asking about the default #VC handler, above. And I'd like to expand on that question with my thoughts here. As far as I understand, mostly from offline conversations with Sean, a big motivation for the built-in #VC handler is that in the (very) long term, people will build a guest firmware without a #VC handler. In other words, eventually (I assume this is many years down the road), the guest firmware will be enlightened enough to avoid NAEs altogether, and instead explicitly call VMGEXIT explicitly, rather than implicitly after triggering a #VC exception. To me, this all sounds very far (e.g., 5-10 years) out. And thus, I still question the value in adding the built-in #VC handler with this as the justification now. However, it's clear from prior discussions that this is the direction preferred by the community. So I'm not pushing back. I'm just making sure we're all on the same page here. Please let me know if I'm mis-understanding this rationale or missing any reasons for why folks want a built-in #VC handler. And again, I think the cover letter should be updated with all the reasons why we're doing what we're doing. Where I think this work could be useful in the shorter term is testing. We have three pieces of code to test: 1. Host-side VMGEXIT handler. 2. Guest-side firmware #VC handler. 3. Guest-side kernel #VC handler. Technically, I guess there are multiple #VC handlers within each guest-side component, depending on the boot stage. But let's consider the steady state handlers that operate after initial guest-side boot stages for now. Using the UEFI's builtin #VC handler tests: - Guest-side firmware #VC handler + host-side VMGEXIT handler. Notably, it does _not_ test the guest-side kernel #VC handler. Looking ahead in the series, I can see that the code for the new builtin kvm-unit-tests #VC handler is basically copied from the Linux kernel #VC handler. Therefore, we could use this work to test: - Guest-side kernel #VC handler + host-side VMGEXIT handler. The difficulty with using this approach to test the guest-side #VC handler is that we now have two copies of the code that can get out of sync: one in the kernel and a 2nd in kvm-unit-tests. However, maybe this is OK. And maybe we can even try to ameliorate this issue with some scripts to compare the two codes or some policy that when we update the Linux kernel proper we send a follow-up patch to kvm-unit-tests. Another approach to test the guest-side kernel #VC handler would be to add tests to run from within a Linux guest. E.g., maybe Linux selftests could work here. We actually have something like this internally, But the way it was written would make it non-trivial to upstream. Thoughts? Tested-by: Marc Orr <marcorr@xxxxxxxxxx>