On Fri, Oct 18, 2024 at 08:03:20AM +0530, Neeraj Upadhyay wrote:
> Hi Kirill,
>
> On 10/17/2024 1:53 PM, Kirill A. Shutemov wrote:
> > On Fri, Sep 13, 2024 at 05:06:51PM +0530, Neeraj Upadhyay wrote:
> >> Introduction
> >> ------------
> >>
> >> Secure AVIC is a new hardware feature in the AMD64 architecture to
> >> allow SEV-SNP guests to prevent hypervisor from generating unexpected
> >> interrupts to a vCPU or otherwise violate architectural assumptions
> >> around APIC behavior.
> >>
> >> One of the significant differences from AVIC or emulated x2APIC is that
> >> Secure AVIC uses a guest-owned and managed APIC backing page. It also
> >> introduces additional fields in both the VMCB and the Secure AVIC backing
> >> page to aid the guest in limiting which interrupt vectors can be injected
> >> into the guest.
> >>
> >> Guest APIC Backing Page
> >> -----------------------
> >> Each vCPU has a guest-allocated APIC backing page of size 4K, which
> >> maintains APIC state for that vCPU. The x2APIC MSRs are mapped at
> >> their corresponding x2APIC MMIO offset within the guest APIC backing
> >> page. All x2APIC accesses by guest or Secure AVIC hardware operate
> >> on this backing page. The backing page should be pinned and NPT entry
> >> for it should be always mapped while the corresponding vCPU is running.
> >>
> >>
> >> MSR Accesses
> >> ------------
> >> Secure AVIC only supports x2APIC MSR accesses. xAPIC MMIO offset based
> >> accesses are not supported.
> >>
> >> Some of the MSR accesses such as ICR writes (with shorthand equal to
> >> self), SELF_IPI, EOI, TPR writes are accelerated by Secure AVIC
> >> hardware. Other MSR accesses generate a #VC exception. The #VC
> >> exception handler reads/writes to the guest APIC backing page.
> >> As guest APIC backing page is accessible to the guest, the Secure
> >> AVIC driver code optimizes APIC register access by directly
> >> reading/writing to the guest APIC backing page (instead of taking
> >> the #VC exception route).
> >>
> >> In addition to the architected MSRs, following new fields are added to
> >> the guest APIC backing page which can be modified directly by the
> >> guest:
> >>
> >> a. ALLOWED_IRR
> >>
> >>    ALLOWED_IRR vector indicates the interrupt vectors which the guest
> >>    allows the hypervisor to send. The combination of host-controlled
> >>    REQUESTED_IRR vectors (part of VMCB) and ALLOWED_IRR is used by
> >>    hardware to update the IRR vectors of the Guest APIC backing page.
> >>
> >>    #Offset    #bits    Description
> >>    204h       31:0     Guest allowed vectors 0-31
> >>    214h       31:0     Guest allowed vectors 32-63
> >>    ...
> >>    274h       31:0     Guest allowed vectors 224-255
> >>
> >>    ALLOWED_IRR is meant to be used specifically for vectors that the
> >>    hypervisor is allowed to inject, such as device interrupts. Interrupt
> >>    vectors used exclusively by the guest itself (like IPI vectors) should
> >>    not be allowed to be injected into the guest for security reasons.
> >>
> >> b. NMI Request
> >>
> >>    #Offset    #bits    Description
> >>    278h       0        Set by Guest to request Virtual NMI
> >>
> >>
> >> LAPIC Timer Support
> >> -------------------
> >> LAPIC timer is emulated by hypervisor. So, APIC_LVTT, APIC_TMICT and
> >> APIC_TDCR, APIC_TMCCT APIC registers are not read/written to the guest
> >> APIC backing page and are communicated to the hypervisor using SVM_EXIT_MSR
> >> VMGEXIT.
> >>
> >> IPI Support
> >> -----------
> >> Only SELF_IPI is accelerated by Secure AVIC hardware. Other IPIs require
> >> writing (from the Secure AVIC driver) to the IRR vector of the target CPU
> >> backing page and then issuing VMGEXIT for the hypervisor to notify the
> >> target vCPU.
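
To make the ALLOWED_IRR layout described above concrete, here is a minimal
userspace sketch of how a vector maps to a register offset and bit in the
backing page. The helper names (allowed_irr_offset(), allowed_irr_mask(),
allow_vector(), vector_allowed()) are hypothetical and not taken from the
series. Only the 204h..274h offsets come from the description; the bit
position inside each 32-bit register (vector % 32) is an assumption based on
the usual APIC IRR layout.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Values from the cover letter's ALLOWED_IRR table: 204h, 214h, ..., 274h. */
#define ALLOWED_IRR_BASE   0x204u
#define APIC_REG_STRIDE    0x10u

/* Hypothetical helper: offset of the 32-bit register covering 'vector'. */
static inline uint32_t allowed_irr_offset(unsigned int vector)
{
	assert(vector < 256);
	return ALLOWED_IRR_BASE + (vector / 32) * APIC_REG_STRIDE;
}

/* Hypothetical helper: bit mask of 'vector' within its register.
 * Assumes bit 0 of each register is the lowest vector of its range. */
static inline uint32_t allowed_irr_mask(unsigned int vector)
{
	assert(vector < 256);
	return UINT32_C(1) << (vector % 32);
}

/* Mark 'vector' as injectable by the hypervisor in a 4K backing page. */
static void allow_vector(uint8_t *page, unsigned int vector)
{
	uint32_t off = allowed_irr_offset(vector);
	uint32_t reg;

	memcpy(&reg, page + off, sizeof(reg));
	reg |= allowed_irr_mask(vector);
	memcpy(page + off, &reg, sizeof(reg));
}

/* Return non-zero if 'vector' is currently allowed in the backing page. */
static int vector_allowed(const uint8_t *page, unsigned int vector)
{
	uint32_t reg;

	memcpy(&reg, page + allowed_irr_offset(vector), sizeof(reg));
	return (reg & allowed_irr_mask(vector)) != 0;
}
```

With a zeroed 4096-byte buffer standing in for the backing page,
allow_vector(page, 33) sets a bit in the register at 214h and leaves every
other vector disallowed. A real driver would use the kernel's own accessors
and per-CPU backing page rather than memcpy() on a plain buffer.
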
> >>
> >> Driver Implementation Open Points
> >> ---------------------------------
> >>
> >> The Secure AVIC driver only supports physical destination mode. If
> >> logical destination mode needs to be supported, then a separate x2apic
> >> driver would be required for supporting logical destination mode.
> >>
> >> Setting of ALLOWED_IRR vectors is done from vector.c for IOAPIC and MSI
> >> interrupts. ALLOWED_IRR vector is not cleared when an interrupt vector
> >> migrates to different CPU. Using a cleaner approach to manage and
> >> configure allowed vectors needs more work.
> >>
> >>
> >> Testing
> >> -------
> >>
> >> This series is based on top of commit 196145c606d0 "Merge
> >> tag 'clk-fixes-for-linus' of
> >> git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux."
> >>
> >> Host Secure AVIC support patch series is at [1].
> >>
> >> Following tests are done:
> >>
> >> 1) Boot to Prompt using initramfs and ubuntu fs.
> >> 2) Verified timer and IPI as part of the guest bootup.
> >> 3) Verified long run SCF TORTURE IPI test.
> >> 4) Verified FIO test with NVME passthrough.
> >
> > One case that is missing is kexec.
> >
> > If the first kernel set ALLOWED_IRR, but the target kernel doesn't know
> > anything about Secure AVIC, there is going to be a problem I assume.
> >
> > I think we need ->setup() counterpart (->teardown() ?) to get
> > configuration back to the boot state. And get it called from kexec path.
> >
>
> Agree, I haven't fully investigated the changes required to support kexec.
> Yes, teardown step might be required to disable Secure AVIC in control MSR
> and possibly resetting other Secure AVIC configuration.
>
> Thanks for pointing it out! I will update the details with kexec support
> being missing in this series.

I think it has to be addressed before it gets merged. Or we will get a
regression.

-- 
  Kiryl Shutsemau / Kirill A. Shutemov