On 13/04/2023 11:24, Suthikulpanit, Suravee wrote: > On 3/17/2023 3:02 AM, Joao Martins wrote: >> GALog exists to propagate interrupts into all vCPUs in the system when >> interrupts are marked as non running (e.g. when vCPUs aren't running). A >> GALog overflow happens when there's in no space in the log to record the >> GATag of the interrupt. So when the GALOverflow condition happens, the >> GALog queue is processed and the GALog is restarted, as the IOMMU >> manual indicates in section "2.7.4 Guest Virtual APIC Log Restart >> Procedure": >> >> | * Wait until MMIO Offset 2020h[GALogRun]=0b so that all request >> | entries are completed as circumstances allow. GALogRun must be 0b to >> | modify the guest virtual APIC log registers safely. >> | * Write MMIO Offset 0018h[GALogEn]=0b. >> | * As necessary, change the following values (e.g., to relocate or >> | resize the guest virtual APIC event log): >> | - the Guest Virtual APIC Log Base Address Register >> | [MMIO Offset 00E0h], >> | - the Guest Virtual APIC Log Head Pointer Register >> | [MMIO Offset 2040h][GALogHead], and >> | - the Guest Virtual APIC Log Tail Pointer Register >> | [MMIO Offset 2048h][GALogTail]. >> | * Write MMIO Offset 2020h[GALOverflow] = 1b to clear the bit (W1C). >> | * Write MMIO Offset 0018h[GALogEn] = 1b, and either set >> | MMIO Offset 0018h[GAIntEn] to enable the GA log interrupt or clear >> | the bit to disable it. >> >> Failing to handle the GALog overflow means that none of the VFs (in any >> guest) will work with IOMMU AVIC forcing the user to power cycle the >> host. When handling the event it resumes the GALog without resizing >> much like how it is done in the event handler overflow. The >> [MMIO Offset 2020h][GALOverflow] bit might be set in status register >> without the [MMIO Offset 2020h][GAInt] bit, so when deciding to poll >> for GA events (to clear space in the galog), also check the overflow >> bit. >> >> [suravee: Check for GAOverflow without GAInt, toggle CONTROL_GAINT_EN] > > According to the AMD IOMMU spec, > > * The GAInt is set when the virtual interrupt request is written to the GALog > and the IOMMU hardware generates an interrupt when GAInt changes from 0 to 1. > > * The GAOverflow bit is set when a new guest virtual APIC event is to be written > to the GALog and there is no usable entry in the GALog, causing the new event > information to be discarded. No interrupt is generated when GALOverflow is > changed from 0b to 1b. > > So, whenever the IOMMU driver detects GALogOverflow, it should also ensure to > process any existing entries in the GALog. > ... And I am doing all that aren't I? Or do you want me edit the commit message to quote these two bullet points from the IOMMU manual? > Please note that we are working on another patch series to isolate the > interrupts for Event, PPR, and GALog so that each one can be handled separately > in a similar fashion. > Cool, if possible please CC me on that series. Joao