Re: [PATCH v2] x86/sev: Fix host kdump support for SNP


On Wed, Sep 04, 2024, Ashish Kalra wrote:
> On 9/4/2024 2:54 PM, Michael Roth wrote:
> >   - Sean inquired about making the target kdump kernel more agnostic to
> >     whether or not SNP_SHUTDOWN was done properly, since that might
> >     allow for capturing state even for edge cases where we can't go
> >     through the normal cleanup path. I mentioned we'd tried this to some
> >     degree but hit issues with the IOMMU, and when working around that
> >     there was another issue but I don't quite recall the specifics.
> >     Can you post a quick recap of what the issues are with that approach
> >     so we can determine whether or not this is still an option?
> 
> Yes, I believe that without SNP_SHUTDOWN, early_enable_iommus() configures
> the IOMMUs into an IRQ remapping configuration, causing the crash in
> io_apic.c::check_timer().
> 
> It looks like in this case, we enable the IRQ remapping configuration
> *earlier* than it needs to be enabled, which causes the panic as indicated:
> 
> EMERGENCY [    1.376701] Kernel panic - not syncing: timer doesn't work
> through Interrupt-remapped IO-APIC

I assume the problem is that IOMMU setup fails in the kdump kernel, not that it
does the setup earlier.  That's the part I want to understand.

Based on the SNP ABI:

  The firmware initializes the IOMMU to perform RMP enforcement. The firmware also
  transitions the event log, PPR log, and completion wait buffers of the IOMMU to
  an RMP page state that is read only to the hypervisor and cannot be assigned to
  guests.

and commit f366a8dac1b8 ("iommu/amd: Clean up RMP entries for IOMMU pages during
SNP shutdown"), my understanding is that the pages used for the IOMMU logs are
forced to read-only for the IOMMU, and so attempting to access those pages in the
kdump kernel will result in an RMP #PF.

That's quite unfortunate, as it means my idea of eating RMP #PFs doesn't really
work, because that idea is based on the assumption that only guest private memory
would generate unexpected RMP #PFs  :-(

> Next, we tried with amd_iommu=off. With that, we don't get the IRQ remapping
> panic during crashkernel boot, but boot still hangs before starting kdump
> tools.
> 
> So eventually we discovered that IRQ remapping is required for x2apic, and
> with amd_iommu=off we don't enable IRQ remapping at all.

Yeah, that makes sense, as does failing to boot if the system isn't configured
properly, i.e. can't send interrupts to all CPUs.
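
To illustrate the "can't send interrupts to all CPUs" point, here is a toy
model in C.  It is not kernel code, and the function names are made up for
illustration.  It assumes the legacy, non-remapped interrupt format with an
8-bit destination ID where 0xFF is reserved for broadcast, so without IRQ
remapping a device interrupt can't target an x2APIC ID of 255 or higher.  It is
not necessarily the exact failure seen in the kdump boot above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: legacy (non-remapped) destination IDs are 8 bits wide,
 * with 0xFF reserved for broadcast. */
#define LEGACY_BROADCAST_ID 0xFFu

/* Hypothetical helper: can a device interrupt target this APIC ID
 * when IRQ remapping is NOT enabled? */
bool dest_reachable_without_remap(uint32_t apic_id)
{
	return apic_id < LEGACY_BROADCAST_ID;
}

/* Hypothetical helper: can device interrupts reach every CPU in the
 * given list of APIC IDs?  With remapping enabled, the full 32-bit
 * x2APIC ID is usable, so every CPU is reachable in this model. */
bool all_cpus_reachable(const uint32_t *apic_ids, size_t n,
			bool irq_remap_enabled)
{
	size_t i;

	assert(apic_ids != NULL || n == 0);

	if (irq_remap_enabled)
		return true;

	for (i = 0; i < n; i++) {
		if (!dest_reachable_without_remap(apic_ids[i]))
			return false;
	}
	return true;
}
```

In this model, a system whose CPUs all have APIC IDs below 255 is reachable
either way, while a system with any larger APIC ID is only reachable when
irq_remap_enabled is true.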




