On Tue, Dec 5, 2023 at 2:05 AM Herve Codina <herve.codina@xxxxxxxxxxx> wrote:
>
> On Mon, 4 Dec 2023 17:03:21 -0600
> Rob Herring <robh@xxxxxxxxxx> wrote:
>
> > On Mon, Dec 4, 2023 at 9:30 AM Herve Codina <herve.codina@xxxxxxxxxxx> wrote:
> > >
> > > Hi Rob,
> > >
> > > On Mon, 4 Dec 2023 07:59:09 -0600
> > > Rob Herring <robh@xxxxxxxxxx> wrote:
> > >
> > > [...]
> > >
> > > > > > diff --git a/drivers/pci/bus.c b/drivers/pci/bus.c
> > > > > > index 9c2137dae429..46b252bbe500 100644
> > > > > > --- a/drivers/pci/bus.c
> > > > > > +++ b/drivers/pci/bus.c
> > > > > > @@ -342,8 +342,6 @@ void pci_bus_add_device(struct pci_dev *dev)
> > > > > >          */
> > > > > >         pcibios_bus_add_device(dev);
> > > > > >         pci_fixup_device(pci_fixup_final, dev);
> > > > > > -       if (pci_is_bridge(dev))
> > > > > > -               of_pci_make_dev_node(dev);
> > > > > >         pci_create_sysfs_dev_files(dev);
> > > > > >         pci_proc_attach_device(dev);
> > > > > >         pci_bridge_d3_update(dev);
> > > > > > diff --git a/drivers/pci/of.c b/drivers/pci/of.c
> > > > > > index 51e3dd0ea5ab..e15eaf0127fc 100644
> > > > > > --- a/drivers/pci/of.c
> > > > > > +++ b/drivers/pci/of.c
> > > > > > @@ -31,6 +31,8 @@ int pci_set_of_node(struct pci_dev *dev)
> > > > > >                 return 0;
> > > > > >
> > > > > >         node = of_pci_find_child_device(dev->bus->dev.of_node, dev->devfn);
> > > > > > +       if (!node && pci_is_bridge(dev))
> > > > > > +               of_pci_make_dev_node(dev);
> > > > > >         if (!node)
> > > > > >                 return 0;
> > > > >
> > > > > Maybe it is too early.
> > > > > of_pci_make_dev_node() creates a node and fills some properties based on
> > > > > some already set values available in the PCI device such as its struct resource
> > > > > values.
> > > > > We need to have some values set by the PCI infra in order to create our DT node
> > > > > with correct values.
> > > >
> > > > Indeed, that's probably the issue I'm having. In that case,
> > > > DECLARE_PCI_FIXUP_HEADER should work. That's later, but still before
> > > > device_add().
> > > >
> > > > I think modifying sysfs after device_add() is going to race with
> > > > userspace. Userspace is notified of a new device, and then the of_node
> > > > link may or may not be there when it reads sysfs. Also, not sure if
> > > > we'll need DT modaliases with PCI devices, but they won't work if the
> > > > DT node is not set before device_add().
> > > >
> > > Ok, we can try using DECLARE_PCI_FIXUP_HEADER.
> > > On your side, is moving from DECLARE_PCI_FIXUP_EARLY to DECLARE_PCI_FIXUP_HEADER
> > > fix your QEMU unittest ?
> >
> > No... I think the problem is we aren't setting the fwnode, just the
> > of_node ptr, but I haven't had a chance to verify that.
> >
> > And testing the bridge part crashes. That's because there's a
> > dependency on the bridge->subordinate to write out bus-range and
> > interrupt-map. I think the fix there is we should just not write those
> > properties. The bus range isn't needed because the kernel does its own
> > assignments. For interrupt-map, it is only needed if "interrupts" is
> > present in the child devices. If not present, then the standard PCI
> > swizzling is used. Alternatively, I think the interrupt mapping could
> > be simplified to just implement the standard swizzling at each level
> > which isn't dependent on any of the devices on the bus. I gave that a
> > go where each interrupt-map just points to the parent bridge, but ran
> > into an issue that the bridge nodes don't have a phandle. That should
> > be fixable, but I'd rather go with the first option. I suppose that
> > depends on how the interrupts downstream of the PCI device need to get
> > resolved. It could be that the PCI device serves as the interrupt
> > controller and can resolve the parent interrupt on its own (which may
> > be needed for ACPI host anyways).
>
> About interrupt, I am a bit stuck on my side.
> My dtso (applied at PCI device level) contains the following:
>     fragment@0 {
>         target-path="";
>         __overlay__ {
>             pci-ep-bus@0 {
>                 compatible = "simple-bus";
>                 #address-cells = <1>;
>                 #size-cells = <1>;
>
>                 /*
>                  * map @0xe2000000 (32MB) to BAR0 (CPU)
>                  * map @0xe0000000 (16MB) to BAR1 (AMBA)
>                  */
>                 ranges = <0xe2000000 0x00 0x00 0x00 0x2000000
>                           0xe0000000 0x01 0x00 0x00 0x1000000>;
>
>                 itc: itc {
>                     compatible = "microchip,lan966x-itc";
>                     #interrupt-cells = <1>;
>                     interrupt-controller;
>                     reg = <0xe00c0120 0x190>;
>                 };
>
>                 ...
>             };
>         };
>     };
>
> I have a 'simple-bus' with a 'ranges' property to translate the BAR addresses
> then several devices. Among them a interrupt controller (itc). Its parent
> interrupt is the one used by the PCI device (INTA).
> I cannot describe this parent interrupt in the dtso because to that I need the
> parent interrupt phandle which will be know only at runtime.

But you don't. The logic to find the interrupt parent is to walk up the
parent nodes until you find 'interrupt-parent' or '#interrupt-cells'
(and a node with interrupt-map always has #interrupt-cells). So your
overlay just needs 'interrupts = <1>' for INTA and it should all just
work.

That of course implies that we need interrupt properties in all the
bridges, which I was hoping to avoid. In the ACPI case, for DT
interrupt parsing to work, we're going to need to end up in an
'interrupt-controller' node somewhere. I think the options are either
we walk interrupt-map properties up to the host bridge, which then
points to something, or the PCI device is the interrupt controller. I
think the PCI device is the right place. How the downstream interrupts
are routed to PCI interrupts is defined by the device. That would work
the same way for both DT and ACPI. If you are concerned about
implementing this in each driver that needs it, some library functions
can mitigate that.

I'm trying to play around with the IRQ domains and get this to work,
but not having any luck yet.
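
To make the parent walk described above concrete, here is a minimal,
self-contained sketch. It is a toy model only: struct toy_node,
toy_irq_find_parent(), toy_check() and the four-node tree are made up
for illustration and are not kernel types or APIs. The real lookup is
of_irq_find_parent() operating on struct device_node. The sketch
assumes the bridge node carries '#interrupt-cells' (as it would with an
interrupt-map), which is exactly the "interrupt properties in all the
bridges" cost mentioned above.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a DT node (hypothetical, not struct device_node). */
struct toy_node {
	const char *name;
	struct toy_node *parent;           /* structural parent in the tree */
	struct toy_node *interrupt_parent; /* explicit 'interrupt-parent', or NULL */
	int has_interrupt_cells;           /* nonzero if '#interrupt-cells' exists */
};

/*
 * Step to the explicit 'interrupt-parent' if the current node has one,
 * otherwise to its structural parent, and stop at the first node that
 * has '#interrupt-cells'. The starting node itself is never returned.
 */
static struct toy_node *toy_irq_find_parent(struct toy_node *child)
{
	struct toy_node *p = child;

	do {
		p = p->interrupt_parent ? p->interrupt_parent : p->parent;
	} while (p && !p->has_interrupt_cells);

	return p;
}

/* Assumed layout: bridge -> PCI device -> simple-bus -> itc. */
static struct toy_node bridge = { .name = "pci-bridge", .has_interrupt_cells = 1 };
static struct toy_node dev = { .name = "pci-dev", .parent = &bridge };
static struct toy_node bus = { .name = "pci-ep-bus@0", .parent = &dev };
static struct toy_node itc = { .name = "itc", .parent = &bus,
			       .has_interrupt_cells = 1 };

/* Usage: itc needs no phandle in the overlay to reach the bridge. */
static void toy_check(void)
{
	assert(toy_irq_find_parent(&itc) == &bridge);
	assert(toy_irq_find_parent(&bridge) == NULL);
}
```

Note that itc has '#interrupt-cells' itself, but the walk starts from
its parent, so the lookup skips the simple-bus and the PCI device node
(neither has '#interrupt-cells' in this model) and ends at the bridge.
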
> Of course, I can modified the overlay applied to tweak the 'interrupt' and
> 'interrupt-parent' in the itc node from my PCI device driver at runtime but I
> would like to avoid this kind of tweak in the PCI device driver.
> This kind of tweak is overlay dependent and needs to be done by each PCI
> device driver that need to work with interrupts.
>
> For BAR addresses translation, we use the 'ranges' property at the PCI device
> node to translate 0 0 0 to BAR0, 1 0 0 to BAR1, ...
> What do you think about a new 'irq-ranges' property to translate the irq number
> and get the irq parent controller base.
>
>   irq-ranges = <child_irq_spec parent_irq_spec length>;

Seems fragile as you have to know something about the parent (the # of
cells), but you don't have the phandle. If you needed multiple
entries, you couldn't parse this.

Rob