On Fri, Sep 13, 2024 at 05:06:51PM +0530, Neeraj Upadhyay wrote:
> Introduction
> ------------
>
> Secure AVIC is a new hardware feature in the AMD64 architecture to
> allow SEV-SNP guests to prevent hypervisor from generating unexpected
> interrupts to a vCPU or otherwise violate architectural assumptions
> around APIC behavior.
>
> One of the significant differences from AVIC or emulated x2APIC is that
> Secure AVIC uses a guest-owned and managed APIC backing page. It also
> introduces additional fields in both the VMCB and the Secure AVIC backing
> page to aid the guest in limiting which interrupt vectors can be injected
> into the guest.
>
> Guest APIC Backing Page
> -----------------------
> Each vCPU has a guest-allocated APIC backing page of size 4K, which
> maintains APIC state for that vCPU. The x2APIC MSRs are mapped at
> their corresponding x2APIC MMIO offset within the guest APIC backing
> page. All x2APIC accesses by guest or Secure AVIC hardware operate
> on this backing page. The backing page should be pinned and NPT entry
> for it should be always mapped while the corresponding vCPU is running.
>
>
> MSR Accesses
> ------------
> Secure AVIC only supports x2APIC MSR accesses. xAPIC MMIO offset based
> accesses are not supported.
>
> Some of the MSR accesses such as ICR writes (with shorthand equal to
> self), SELF_IPI, EOI, TPR writes are accelerated by Secure AVIC
> hardware. Other MSR accesses generate a #VC exception. The #VC
> exception handler reads/writes to the guest APIC backing page.
> As guest APIC backing page is accessible to the guest, the Secure
> AVIC driver code optimizes APIC register access by directly
> reading/writing to the guest APIC backing page (instead of taking
> the #VC exception route).
>
> In addition to the architected MSRs, following new fields are added to
> the guest APIC backing page which can be modified directly by the
> guest:
>
> a. ALLOWED_IRR
>
> ALLOWED_IRR vector indicates the interrupt vectors which the guest
> allows the hypervisor to send. The combination of host-controlled
> REQUESTED_IRR vectors (part of VMCB) and ALLOWED_IRR is used by
> hardware to update the IRR vectors of the Guest APIC backing page.
>
>   #Offset        #bits        Description
>   204h           31:0         Guest allowed vectors 0-31
>   214h           31:0         Guest allowed vectors 32-63
>   ...
>   274h           31:0         Guest allowed vectors 224-255
>
> ALLOWED_IRR is meant to be used specifically for vectors that the
> hypervisor is allowed to inject, such as device interrupts. Interrupt
> vectors used exclusively by the guest itself (like IPI vectors) should
> not be allowed to be injected into the guest for security reasons.
>
> b. NMI Request
>
>   #Offset        #bits        Description
>   278h           0            Set by Guest to request Virtual NMI
>
>
> LAPIC Timer Support
> -------------------
> LAPIC timer is emulated by hypervisor. So, APIC_LVTT, APIC_TMICT and
> APIC_TDCR, APIC_TMCCT APIC registers are not read/written to the guest
> APIC backing page and are communicated to the hypervisor using SVM_EXIT_MSR
> VMGEXIT.
>
> IPI Support
> -----------
> Only SELF_IPI is accelerated by Secure AVIC hardware. Other IPIs require
> writing (from the Secure AVIC driver) to the IRR vector of the target CPU
> backing page and then issuing VMGEXIT for the hypervisor to notify the
> target vCPU.
>
> Driver Implementation Open Points
> ---------------------------------
>
> The Secure AVIC driver only supports physical destination mode. If
> logical destination mode needs to be supported, then a separate x2apic
> driver would be required for supporting logical destination mode.
>
> Setting of ALLOWED_IRR vectors is done from vector.c for IOAPIC and MSI
> interrupts. ALLOWED_IRR vector is not cleared when an interrupt vector
> migrates to different CPU. Using a cleaner approach to manage and
> configure allowed vectors needs more work.
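
For reference, the ALLOWED_IRR layout quoted above can be sketched in plain
C as below. This is only an illustration, not code from the series: the
helper names are hypothetical, the 0x10 register stride is an assumption
inferred from the table (204h, 214h, ..., 274h), and the backing page is
modelled as an ordinary 4K byte buffer. The "clear" path shows what
revoking a vector might look like (e.g. when a vector migrates away); the
posted driver does not do this today.

```c
#include <stdint.h>
#include <string.h>

#define SAVIC_ALLOWED_IRR_BASE	0x204u	/* first register, from the table */
#define SAVIC_REG_STRIDE	0x10u	/* assumed from 204h, 214h, ..., 274h */

/* Hypothetical helper: byte offset of the 32-bit register covering @vector. */
static inline uint32_t savic_allowed_irr_offset(uint8_t vector)
{
	return SAVIC_ALLOWED_IRR_BASE + (vector / 32u) * SAVIC_REG_STRIDE;
}

/* Hypothetical helper: bit mask of @vector within its 32-bit register. */
static inline uint32_t savic_allowed_irr_mask(uint8_t vector)
{
	return 1u << (vector % 32u);
}

/*
 * Hypothetical helper: allow (@allow != 0) or revoke (@allow == 0) injection
 * of @vector in the backing page at @page. memcpy() is used so the sketch
 * does not depend on the alignment of the buffer.
 */
static void savic_update_allowed_irr(uint8_t *page, uint8_t vector, int allow)
{
	uint32_t off = savic_allowed_irr_offset(vector);
	uint32_t mask = savic_allowed_irr_mask(vector);
	uint32_t reg;

	memcpy(&reg, page + off, sizeof(reg));
	if (allow)
		reg |= mask;
	else
		reg &= ~mask;
	memcpy(page + off, &reg, sizeof(reg));
}
```

With this mapping, vector 255 lands in the register at 274h, which matches
the last row of the table and stays below the NMI Request field at 278h.
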
>
>
> Testing
> -------
>
> This series is based on top of commit 196145c606d0 "Merge
> tag 'clk-fixes-for-linus' of
> git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux."
>
> Host Secure AVIC support patch series is at [1].
>
> Following tests are done:
>
> 1) Boot to Prompt using initramfs and ubuntu fs.
> 2) Verified timer and IPI as part of the guest bootup.
> 3) Verified long run SCF TORTURE IPI test.
> 4) Verified FIO test with NVME passthrough.

One case that is missing is kexec.

If the first kernel sets ALLOWED_IRR, but the target kernel doesn't know
anything about Secure AVIC, there is going to be a problem, I assume.

I think we need a ->setup() counterpart (->teardown()?) to get the
configuration back to the boot state, and to get it called from the kexec
path.

-- 
  Kiryl Shutsemau / Kirill A. Shutemov