On Thu, Oct 08 2020 at 17:08, David Woodhouse wrote: > On Thu, 2020-10-08 at 13:55 +0100, David Woodhouse wrote: > > (We'd want the x86_vector_domain to actually have an MSI compose > function in the !CONFIG_PCI_MSI case if we did this, of course.) The compose function and the vector domain wrapper can simply move to vector.c > From 2fbc79588d4677ee1cc9df661162fcf1a7da57f0 Mon Sep 17 00:00:00 2001 > From: David Woodhouse <dwmw@xxxxxxxxxxxx> > Date: Thu, 8 Oct 2020 15:44:42 +0100 > Subject: [PATCH 6/5] x86/ioapic: Generate RTE directly from parent irqchip's MSI > message > > The IOAPIC generates an MSI cycle with address/data bits taken from its > Redirection Table Entry in some combination which used to make sense, > but now is just a bunch of bits which get passed through in some > seemingly arbitrary order. > > Instead of making IRQ remapping drivers directly frob the IOAPIC RTE, > let them just do their job and generate an MSI message. The bit > swizzling to turn that MSI message into the IOAPIC's RTE is the same in > all cases, since it's a function of the IOAPIC hardware. The IRQ > remappers have no real need to get involved with that. > > The only slight caveat is that the IOAPIC is interpreting some of > those fields too, and it does want the 'vector' field to be unique > to make EOI work. The AMD IOMMU happens to put its IRTE index in the > bits that the IOAPIC thinks are the vector field, and accommodates > this requirement by reserving the first 32 indices for the IOAPIC. > The Intel IOMMU doesn't actually use the bits that the IOAPIC thinks > are the vector field, so it fills in the 'pin' value there instead. > > Signed-off-by: David Woodhouse <dwmw@xxxxxxxxxxxx> > --- > arch/x86/include/asm/hw_irq.h | 11 +++--- > arch/x86/include/asm/msidef.h | 2 ++ > arch/x86/kernel/apic/io_apic.c | 55 ++++++++++++++++++----------- > drivers/iommu/amd/iommu.c | 14 -------- > drivers/iommu/hyperv-iommu.c | 31 ---------------- > drivers/iommu/intel/irq_remapping.c | 19 +++------- > 6 files changed, 46 insertions(+), 86 deletions(-) Nice :) > +static void mp_swizzle_msi_dest_bits(struct irq_data *irq_data, void *_entry) > +{ > + struct msi_msg msg; > + u32 *entry = _entry; > + > + irq_chip_compose_msi_msg(irq_data, &msg); Duh, for some stupid reason it never occured to me to do it that way. Historically the MSI compose function was part of the MSI PCI chip and I just changed that recently when I reworked the code to make IMS support possible. Historical blinders are pretty sticky :( > + /* > + * They're in a bit of a random order for historical reasons, but > + * the IO/APIC is just a device for turning interrupt lines into > + * MSIs, and various bits of the MSI addr/data are just swizzled > + * into/from the bits of Redirection Table Entry. > + */ > + entry[0] &= 0xfffff000; > + entry[0] |= (msg.data & (MSI_DATA_DELIVERY_MODE_MASK | > + MSI_DATA_VECTOR_MASK)); > + entry[0] |= (msg.address_lo & MSI_ADDR_DEST_MODE_MASK) << 9; > + > + entry[1] &= 0xffff; > + entry[1] |= (msg.address_lo & MSI_ADDR_DEST_ID_MASK) << 12; > +} .... > switch (info->type) { > case X86_IRQ_ALLOC_TYPE_IOAPIC: > - /* Setup IOAPIC entry */ > - entry = info->ioapic.entry; > - info->ioapic.entry = NULL; > - memset(entry, 0, sizeof(*entry)); > - entry->vector = index; > - entry->mask = 0; > - entry->trigger = info->ioapic.trigger; > - entry->polarity = info->ioapic.polarity; > - /* Mask level triggered irqs. */ > - if (info->ioapic.trigger) > - entry->mask = 1; > - break; > - Thanks for cleaning this up! tglx