On Tue, Sep 15, 2020 at 08:24:22AM -0500, Tom Lendacky wrote:
> On 9/14/20 3:58 PM, Sean Christopherson wrote:
> >> @@ -79,6 +88,9 @@ static inline void kvm_register_write(struct kvm_vcpu *vcpu, int reg,
> >>  	if (WARN_ON_ONCE((unsigned int)reg >= NR_VCPU_REGS))
> >>  		return;
> >>  
> >> +	if (kvm_x86_ops.reg_write_override)
> >> +		kvm_x86_ops.reg_write_override(vcpu, reg, val);
> >
> > There has to be a more optimal approach for propagating registers between
> > vcpu->arch.regs and the VMSA than adding a per-GPR hook.  Why not simply
> > copy the entire set of registers to/from the VMSA on every exit and entry?
> > AFAICT, valid_bits is only used in the read path, and KVM doesn't do
> > anything sophisticated when it hits a !valid_bits read.
>
> That would probably be ok. And actually, the code might be able to just
> check the GHCB valid bitmap for valid regs on exit, copy them and then
> clear the bitmap. The write code could check if vmsa_encrypted is set and
> then set a "valid" bit for the reg that could be used to set regs on entry.
>
> I'm not sure if turning kvm_vcpu_arch.regs into a struct and adding a
> valid bit would be overkill or not.

KVM already has space in regs_avail and regs_dirty for GPRs; they're just not
used by the get/set helpers because the GPRs are always loaded/stored for both
SVM and VMX.

I assume nothing will break if KVM "writes" random GPRs in the VMSA?  I can't
see how the guest would achieve any level of security if it wantonly consumes
GPRs, i.e. it's the guest's responsibility to consume only the relevant GPRs.
If that holds true, then avoiding the copying isn't functionally necessary,
it's really just a performance optimization.

One potentially crazy idea would be to change vcpu->arch.regs to be a pointer
(defaulting to a __regs array), and then have SEV-ES switch it to point
directly at the VMSA array (I think the layout is identical for x86-64?).
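Very rough sketch of what I have in mind; it only works if the VMSA GPRs are
contiguous and ordered like VCPU_REGS_* (the "I think the layout is identical"
part above), and any names beyond what's already in this patch are made up:

  struct kvm_vcpu_arch {
          /* ... */
          unsigned long *regs;                    /* active GPR array */
          unsigned long __regs[NR_VCPU_REGS];     /* default backing store */
          /* ... */
  };

  /* Common vCPU init: GPRs live in KVM's internal array by default. */
  vcpu->arch.regs = vcpu->arch.__regs;

  /*
   * SEV-ES vCPU init: alias the GPRs directly onto the VMSA so that reads
   * and writes via the accessors hit the save area with no copying.  Again,
   * this assumes the VMSA GPR block matches the VCPU_REGS_* layout.
   */
  svm->vcpu.arch.regs = (unsigned long *)&get_vmsa(svm)->rax;

The existing kvm_register_{read,write}() helpers wouldn't need to change at
all, they'd just dereference whichever array vcpu->arch.regs points at.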
> >> @@ -4012,6 +4052,99 @@ static bool svm_apic_init_signal_blocked(struct kvm_vcpu *vcpu)
> >>  		(svm->vmcb->control.intercept & (1ULL << INTERCEPT_INIT));
> >>  }
> >>  
> >> +/*
> >> + * These return values represent the offset in quad words within the VM save
> >> + * area. This allows them to be accessed by casting the save area to a u64
> >> + * array.
> >> + */
> >> +#define VMSA_REG_ENTRY(_field) (offsetof(struct vmcb_save_area, _field) / sizeof(u64))
> >> +#define VMSA_REG_UNDEF VMSA_REG_ENTRY(valid_bitmap)
> >> +static inline unsigned int vcpu_to_vmsa_entry(enum kvm_reg reg)
> >> +{
> >> +	switch (reg) {
> >> +	case VCPU_REGS_RAX:	return VMSA_REG_ENTRY(rax);
> >> +	case VCPU_REGS_RBX:	return VMSA_REG_ENTRY(rbx);
> >> +	case VCPU_REGS_RCX:	return VMSA_REG_ENTRY(rcx);
> >> +	case VCPU_REGS_RDX:	return VMSA_REG_ENTRY(rdx);
> >> +	case VCPU_REGS_RSP:	return VMSA_REG_ENTRY(rsp);
> >> +	case VCPU_REGS_RBP:	return VMSA_REG_ENTRY(rbp);
> >> +	case VCPU_REGS_RSI:	return VMSA_REG_ENTRY(rsi);
> >> +	case VCPU_REGS_RDI:	return VMSA_REG_ENTRY(rdi);
> >> +#ifdef CONFIG_X86_64

Is KVM SEV-ES going to support 32-bit builds?

> >> +	case VCPU_REGS_R8:	return VMSA_REG_ENTRY(r8);
> >> +	case VCPU_REGS_R9:	return VMSA_REG_ENTRY(r9);
> >> +	case VCPU_REGS_R10:	return VMSA_REG_ENTRY(r10);
> >> +	case VCPU_REGS_R11:	return VMSA_REG_ENTRY(r11);
> >> +	case VCPU_REGS_R12:	return VMSA_REG_ENTRY(r12);
> >> +	case VCPU_REGS_R13:	return VMSA_REG_ENTRY(r13);
> >> +	case VCPU_REGS_R14:	return VMSA_REG_ENTRY(r14);
> >> +	case VCPU_REGS_R15:	return VMSA_REG_ENTRY(r15);
> >> +#endif
> >> +	case VCPU_REGS_RIP:	return VMSA_REG_ENTRY(rip);
> >> +	default:
> >> +		WARN_ONCE(1, "unsupported VCPU to VMSA register conversion\n");
> >> +		return VMSA_REG_UNDEF;
> >> +	}
> >> +}
> >> +
> >> +/* For SEV-ES guests, populate the vCPU register from the appropriate VMSA/GHCB */
> >> +static void svm_reg_read_override(struct kvm_vcpu *vcpu, enum kvm_reg reg)
> >> +{
> >> +	struct vmcb_save_area *vmsa;
> >> +	struct vcpu_svm *svm;
> >> +	unsigned int entry;
> >> +	unsigned long val;
> >> +	u64 *vmsa_reg;
> >> +
> >> +	if (!sev_es_guest(vcpu->kvm))
> >> +		return;
> >> +
> >> +	entry = vcpu_to_vmsa_entry(reg);
> >> +	if (entry == VMSA_REG_UNDEF)
> >> +		return;
> >> +
> >> +	svm = to_svm(vcpu);
> >> +	vmsa = get_vmsa(svm);
> >> +	vmsa_reg = (u64 *)vmsa;
> >> +	val = (unsigned long)vmsa_reg[entry];
> >> +
> >> +	/* If a GHCB is mapped, check the bitmap of valid entries */
> >> +	if (svm->ghcb) {
> >> +		if (!test_bit(entry, (unsigned long *)vmsa->valid_bitmap))
> >> +			val = 0;
> >
> > Is KVM relying on this being 0?  Would it make sense to stuff something like
> > 0xaaaa... or 0xdeadbeefdeadbeef so that consumption of bogus data is more
> > noticeable?
>
> No, KVM isn't relying on this being 0. I thought about using something
> other than 0 here, but settled on just using 0. I'm open to changing that,
> though. I'm not sure if there's an easy way to short-circuit the intercept
> and respond back with an error at this point, that would be optimal.

Ya, responding with an error would be ideal.  At this point, we're taking the
same lazy approach for TDX and effectively consuming garbage if the guest
requests emulation but doesn't expose the necessary GPRs.  That being said,
TDX's guest/host ABI is quite rigid, so all the "is this register valid" checks
could be hardcoded into the higher level "emulation" flows.

Would that also be an option for SEV-ES?
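For SEV-ES, a hardcoded version might look something like the below.  Rough
sketch only: the helper name and the exit-code plumbing are made up, and only
a couple of cases are filled in, just to show validating the GHCB bitmap up
front instead of zeroing registers lazily on read:

  static int sev_es_validate_ghcb_regs(struct vcpu_svm *svm, u64 exit_code)
  {
          unsigned long *bitmap = (unsigned long *)get_vmsa(svm)->valid_bitmap;

          switch (exit_code) {
          case SVM_EXIT_CPUID:
                  /* CPUID consumes the leaf in RAX and the subleaf in RCX. */
                  if (!test_bit(VMSA_REG_ENTRY(rax), bitmap) ||
                      !test_bit(VMSA_REG_ENTRY(rcx), bitmap))
                          return -EINVAL;
                  return 0;
          case SVM_EXIT_MSR:
                  /*
                   * Both RDMSR and WRMSR need the MSR index in RCX; a real
                   * version would also check RAX/RDX for WRMSR.
                   */
                  return test_bit(VMSA_REG_ENTRY(rcx), bitmap) ? 0 : -EINVAL;
          default:
                  return 0;
          }
  }

The #VMGEXIT handler would call that before dispatching to the emulation flow
and, on failure, bounce an error back to the guest via the GHCB instead of
letting the emulation code consume zeros.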