On 2014/12/9 22:59, Marc Zyngier wrote: > On 09/12/14 14:35, Jiang Liu wrote: >> On 2014/12/9 22:27, Marc Zyngier wrote: >>> On 09/12/14 14:11, Jiang Liu wrote: >>>> On 2014/12/9 22:03, Marc Zyngier wrote: >>>>> Hi Gerry, >>>>> >>>>> On 09/12/14 12:47, Jiang Liu wrote: >>>>>> On 2014/12/9 20:12, Marc Zyngier wrote: >>>>>>> Yijing, >>>>>>> >>>>>>> On 09/12/14 11:57, Yijing Wang wrote: >>>>>>>>>>> +void __weak pcibios_set_phb_msi_domain(struct pci_bus *bus) >>>>>>>>>>> +{ >>>>>>>>>>> +} >>>>>>>>>>> + >>>>>>>>>>> +static void pci_set_bus_msi_domain(struct pci_bus *bus) >>>>>>>>>>> +{ >>>>>>>>>>> + struct pci_dev *bridge = bus->self; >>>>>>>>>>> + >>>>>>>>>>> + if (!bridge) >>>>>>>>>>> + pcibios_set_phb_msi_domain(bus); >>>>>>>>>>> + else >>>>>>>>>>> + dev_set_msi_domain(&bus->dev, dev_get_msi_domain(&bridge->dev)); >>>>>>>>>>> +} >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Hi Marc, we can not assume pci devices under same phb share the same msi irq domain, >>>>>>>>>> now in x86, pci devices under the same phb may associate different msi irq domain. >>>>>>>> >>>>>>>> Hi Marc, >>>>>>>> >>>>>>>>> >>>>>>>>> Well, this is not supposed to be a perfect solution yet, but instead a >>>>>>>>> basis for discussion. What I'd like to find out is: >>>>>>>>> >>>>>>>>> - What is the minimum granularity for associating a device with its MSI >>>>>>>>> domain in existing platforms? >>>>>>>> >>>>>>>> PCI device, after Gerry's msi irq domain patchset which now in linux-next, >>>>>>>> in x86, we will find msi irq domain by pci_dev. >>>>>>> >>>>>>> Are you *really* associating the MSI domain on a per pci-device basis? >>>>>>> That is, you have devices on the same PCI bus talking to different MSI hw? >>>>>> Hi Marc, >>>>>> This is a little wild:( >>>>>> On x86 platform with Intel VT-d(not the case for AMD-v), >>>>>> interrupt remapping is tight to DMA remapping (IOMMU) unit. >>>>>> For most common cases, IOMMU unit manages PCI bus and its sub-hierarchy. >>>>>> But it may also manage a specific PCI device. This is typically used to >>>>>> provide QoS for audio device by using dedicated IOMMU unit to avoid >>>>>> resource contention on DMA remapping tables. BIOS uses ACPI table to >>>>>> report PCI bus/device to IOMMU unit mapping relationship. (To be honest, >>>>>> I have no really experience with such a hardware platform yet, just for >>>>>> theoretical analysis) >>>>>> On the other hand, we now support hierarchy irqdomain. So to >>>>>> support per-PCI IOMMU unit case, we need maintain irqdomain at PCI >>>>>> device level. >>>>>> This piece of code from your [4/6] is flexible enough, which >>>>>> retrieves msi_domain from PCI device, then fallback to PCI bus, >>>>>> then fallback to platform specific method. >>>>>> domain = dev_get_msi_domain(&dev->dev); >>>>>> if (!domain && dev->bus->msi) >>>>>> domain = dev->bus->msi->domain; >>>>>> if (!domain) >>>>>> domain = arch_get_pci_msi_domain(dev); >>>>> >>>>> OK. But what I'd really like to see is a way to setup the >>>>> device<->domain binding as early as possible, without having to use more >>>>> conditional code in pci_msi_get_domain. >>>>> >>>>> IOW, can we do something similar to what pci_set_bus_msi_domain and >>>>> pci_set_msi_domain do in this patch? >>>> Hi Marc, >>>> I have checked x86 code, we could set pci_dev->msi_domain >>>> when creating PCI devices, just need to find some hook points >>>> into PCI core next step. If arch code doesn't set pci_dev->msi_domain, >>>> PCI MSI core may provide a default way to set pci_dev->msi_domain. >>>> This may make the implementation simpler, I guess:) >>> >>> Right. So following your earlier suggestion, I could make >>> pci_set_msi_domain a weak symbol and let arch code override this. >>> >>> My preference would have been to have arch code to create a set of >>> arch-independent data structures describing the topology, and use that >>> for everything, but maybe that's a bit ambitious for a start. >>> >>> I'll rework the series to make the symbols weak. >> Hi Marc, >> I think we may not need the weak symbol at all. With following >> draft patch, the PCI MSI core may simply do: >> if (pci_dev->dev.msi_domain == NULL) >> dev_set_msi_domain(&dev->dev, >> dev_get_msi_domain(&dev->bus->dev)); >> ----------------------------------------------------------------------- >> Note: the patch won't pass compilation, just to show the key idea:) >> >> diff --git a/arch/x86/kernel/apic/msi.c b/arch/x86/kernel/apic/msi.c >> index da163da5fdee..8147d25d4349 100644 >> --- a/arch/x86/kernel/apic/msi.c >> +++ b/arch/x86/kernel/apic/msi.c >> @@ -67,6 +67,23 @@ static struct irq_chip pci_msi_controller = { >> .flags = IRQCHIP_SKIP_SET_WAKE, >> }; >> >> +struct irq_domain *x86_get_pci_msi_domain(struct pci_dev *dev) >> +{ >> + struct irq_domain *domain; >> + struct irq_alloc_info info; >> + >> + init_irq_alloc_info(&info, NULL); >> + info.type = X86_IRQ_ALLOC_TYPE_MSI; >> + info.msi_dev = dev; >> + domain = irq_remapping_get_irq_domain(&info); >> + if (domain == NULL) >> + domain = msi_default_domain; >> + if (domain == NULL) >> + domain = ERR_PTR(-ENOSYS); >> + >> + return domain; >> +} >> + >> int native_setup_msi_irqs(struct pci_dev *dev, int nvec, int type) >> { >> struct irq_domain *domain; >> diff --git a/arch/x86/pci/common.c b/arch/x86/pci/common.c >> index 7b20bccf3648..a26f30a8bb8f 100644 >> --- a/arch/x86/pci/common.c >> +++ b/arch/x86/pci/common.c >> @@ -652,6 +652,9 @@ int pcibios_add_device(struct pci_dev *dev) >> pa_data = data->next; >> iounmap(data); >> } >> + >> + dev->dev.msi_domain = x86_get_pci_msi_domain(dev); >> + >> return 0; >> } > > Right. So you set the msi_domain using the pcibios_add_device callback. > That will require some minimal surgery (the call to pci_set_msi_domain > happens before the pcibios call, so it needs to be relocated after), but > that seems like a sensible solution to me. So the key point is clear now: The PCI MSI core will try to set a default value for pci_dev->dev.msi_domain if the arch code doesn't do that. Seems like a solution:) > > Thanks! > > M. > -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html