On Mon, Dec 5, 2022 at 6:30 PM Lizhi Hou <lizhi.hou@xxxxxxx> wrote: > > > On 12/1/22 13:12, Rob Herring wrote: > > On Mon, Nov 21, 2022 at 08:43:03AM -0800, Lizhi Hou wrote: > >> The PCI endpoint device such as Xilinx Alveo PCI card maps the register > >> spaces from multiple hardware peripherals to its PCI BAR. Normally, > >> the PCI core discovers devices and BARs using the PCI enumeration process. > >> There is no infrastructure to discover the hardware peripherals that are > >> present in a PCI device, and which can be accessed through the PCI BARs. > >> > >> For Alveo PCI card, the card firmware provides a flattened device tree to > >> describe the hardware peripherals on its BARs. The Alveo card driver can > >> load this flattened device tree and leverage device tree framework to > >> generate platform devices for the hardware peripherals eventually. > >> > >> Apparently, the device tree framework requires a device tree node for the > >> PCI device. Thus, it can generate the device tree nodes for hardware > >> peripherals underneath. Because PCI is self discoverable bus, there might > >> not be a device tree node created for PCI devices. This patch is to add > >> support to generate device tree node for PCI devices. > >> > >> Added a kernel option. When the option is turned on, the kernel will > >> generate device tree nodes for PCI bridges unconditionally. > >> > >> Initially, the basic properties are added for the dynamically generated > >> device tree nodes. > >> > >> Signed-off-by: Lizhi Hou <lizhi.hou@xxxxxxx> > >> Signed-off-by: Sonal Santan <sonal.santan@xxxxxxx> > >> Signed-off-by: Max Zhen <max.zhen@xxxxxxx> > >> Reviewed-by: Brian Xu <brian.xu@xxxxxxx> > >> --- > >> drivers/pci/Kconfig | 12 ++ > >> drivers/pci/Makefile | 1 + > >> drivers/pci/bus.c | 2 + > >> drivers/pci/msi/irqdomain.c | 6 +- > >> drivers/pci/of.c | 71 ++++++++++ > >> drivers/pci/of_property.c | 256 ++++++++++++++++++++++++++++++++++++ > >> drivers/pci/pci-driver.c | 3 +- > >> drivers/pci/pci.h | 19 +++ > >> drivers/pci/remove.c | 1 + > >> 9 files changed, 368 insertions(+), 3 deletions(-) > >> create mode 100644 drivers/pci/of_property.c > >> > >> diff --git a/drivers/pci/Kconfig b/drivers/pci/Kconfig > >> index 55c028af4bd9..126c31b79718 100644 > >> --- a/drivers/pci/Kconfig > >> +++ b/drivers/pci/Kconfig > >> @@ -198,6 +198,18 @@ config PCI_HYPERV > >> The PCI device frontend driver allows the kernel to import arbitrary > >> PCI devices from a PCI backend to support PCI driver domains. > >> > >> +config PCI_DYNAMIC_OF_NODES > >> + bool "Device tree node for PCI devices" > > Create Devicetree nodes for PCI devices > Sure. > > > > But as I've said before, making this a config option doesn't really work > > except as something to experiment with. Once you add your driver and > > want to do a 'select PCI_DYNAMIC_OF_NODES', you've affected everyone > > else. > > Do you mean we should remove PCI_DYNAMIC_OF_NODES and make > creating dynamic tree node default? No, I'm saying as long as it is a config option, it's not useful for more than experimentation. A distro kernel has to decide how to set a config option for *everyone*. > Based on the previous discussions, the approach I am implementing > is to create device tree nodes for all PCI bridges and devices defined > by pci quirks. I believe a PCI endpoint which is not defined by PCI quirks > should not to be affected because there is no device tree node is created > for it. > > Are you fine with this approach? How does that work? The quirks run when a device is discovered. At that time you've already discovered and probed everything upstream of the device. So the only thing controlling the upstream devices getting a DT node is the config option, right? > >> + depends on OF > >> + select OF_DYNAMIC > >> + help > >> + This option enables support for generating device tree nodes for some > >> + PCI devices. Thus, the driver of this kind can load and overlay > >> + flattened device tree for its downstream devices. > >> + > >> + Once this option is selected, the device tree nodes will be generated > >> + for all PCI bridges. > >> + > >> choice > >> prompt "PCI Express hierarchy optimization setting" > >> default PCIE_BUS_DEFAULT > >> diff --git a/drivers/pci/Makefile b/drivers/pci/Makefile > >> index 2680e4c92f0a..cc8b4e01e29d 100644 > >> --- a/drivers/pci/Makefile > >> +++ b/drivers/pci/Makefile > >> @@ -32,6 +32,7 @@ obj-$(CONFIG_PCI_P2PDMA) += p2pdma.o > >> obj-$(CONFIG_XEN_PCIDEV_FRONTEND) += xen-pcifront.o > >> obj-$(CONFIG_VGA_ARB) += vgaarb.o > >> obj-$(CONFIG_PCI_DOE) += doe.o > >> +obj-$(CONFIG_PCI_DYNAMIC_OF_NODES) += of_property.o > >> > >> # Endpoint library must be initialized before its users > >> obj-$(CONFIG_PCI_ENDPOINT) += endpoint/ > >> diff --git a/drivers/pci/bus.c b/drivers/pci/bus.c > >> index 3cef835b375f..8507cc32b61d 100644 > >> --- a/drivers/pci/bus.c > >> +++ b/drivers/pci/bus.c > >> @@ -316,6 +316,8 @@ void pci_bus_add_device(struct pci_dev *dev) > >> */ > >> pcibios_bus_add_device(dev); > >> pci_fixup_device(pci_fixup_final, dev); > >> + if (pci_is_bridge(dev)) > >> + of_pci_make_dev_node(dev); > >> pci_create_sysfs_dev_files(dev); > >> pci_proc_attach_device(dev); > >> pci_bridge_d3_update(dev); > >> diff --git a/drivers/pci/msi/irqdomain.c b/drivers/pci/msi/irqdomain.c > >> index e9cf318e6670..eeaf44169bfd 100644 > >> --- a/drivers/pci/msi/irqdomain.c > >> +++ b/drivers/pci/msi/irqdomain.c > >> @@ -230,8 +230,10 @@ u32 pci_msi_domain_get_msi_rid(struct irq_domain *domain, struct pci_dev *pdev) > >> pci_for_each_dma_alias(pdev, get_msi_id_cb, &rid); > >> > >> of_node = irq_domain_get_of_node(domain); > >> - rid = of_node ? of_msi_map_id(&pdev->dev, of_node, rid) : > >> - iort_msi_map_id(&pdev->dev, rid); > >> + if (of_node && !of_node_check_flag(of_node, OF_DYNAMIC)) > >> + rid = of_msi_map_id(&pdev->dev, of_node, rid); > >> + else > >> + rid = iort_msi_map_id(&pdev->dev, rid); > > I have no idea if this works. It looks kind of broken already if > > !of_node calls iort_msi_map_id(). With a DT only system, I would think > > we'd always call of_msi_map_id(). Have you tested MSIs? > > > > With a mixed system, I have no idea what happens. That needs to be > > understood for MSI, DMA, and interrupts. > > Yes, I tested MSI in VM. > > # cat > /sys/devices/platform/3f000000.pcie/pci0000:00/0000:00:02.0/0000:09:00.0/0000:0a:00.0/msi_irqs/29 > > msi > # cat /proc/interrupts | grep 29 > 29: 1 0 MSI 5242880 Edge pciehp > > The idea is to preserve the current behaviror. > > current: PCI device does not have dt node, thus iort_msi_map_id() > is called. > > modified code: PCI device has dt node but with OF_DYNAMIC flag, > thus iort_msi_map_id() is called. > > I was planning to take on of_msi_map_id() for dynamically generated dt node > > in future when we see a real use case? > > > > >> > >> return rid; > >> } > >> diff --git a/drivers/pci/of.c b/drivers/pci/of.c > >> index 196834ed44fe..fb60b04f0b93 100644 > >> --- a/drivers/pci/of.c > >> +++ b/drivers/pci/of.c > >> @@ -469,6 +469,8 @@ static int of_irq_parse_pci(const struct pci_dev *pdev, struct of_phandle_args * > >> } else { > >> /* We found a P2P bridge, check if it has a node */ > >> ppnode = pci_device_to_OF_node(ppdev); > >> + if (of_node_check_flag(ppnode, OF_DYNAMIC)) > >> + ppnode = NULL; > >> } > >> > >> /* > >> @@ -599,6 +601,75 @@ int devm_of_pci_bridge_init(struct device *dev, struct pci_host_bridge *bridge) > >> return pci_parse_request_of_pci_ranges(dev, bridge); > >> } > >> > >> +#if IS_ENABLED(CONFIG_PCI_DYNAMIC_OF_NODES) > >> + > >> +void of_pci_remove_node(struct pci_dev *pdev) > >> +{ > >> + struct device_node *dt_node; > > node or np are the typical names. > Will fix this. > > > >> + > >> + dt_node = pci_device_to_OF_node(pdev); > >> + if (!dt_node || !of_node_check_flag(dt_node, OF_DYNAMIC)) > >> + return; > >> + pdev->dev.of_node = NULL; > >> + > >> + of_destroy_node(dt_node); > >> +} > >> + > >> +void of_pci_make_dev_node(struct pci_dev *pdev) > >> +{ > >> + struct device_node *parent, *dt_node = NULL; > >> + const char *pci_type = "dev"; > >> + struct of_changeset *cset; > >> + const char *full_name; > >> + int ret; > >> + > >> + /* > >> + * If there is already a device tree node linked to this device, > >> + * return immediately. > >> + */ > >> + if (pci_device_to_OF_node(pdev)) > >> + return; > >> + > >> + /* Check if there is device tree node for parent device */ > >> + if (!pdev->bus->self) > >> + parent = pdev->bus->dev.of_node; > >> + else > >> + parent = pdev->bus->self->dev.of_node; > >> + if (!parent) > >> + return; > >> + > >> + if (pci_is_bridge(pdev)) > >> + pci_type = "pci"; > > What's the node name if not a bridge? I don't see how that would work. > > > > It should depend on the device class if it has one. > > pci_type is initialized with "dev" > > + const char *pci_type = "dev"; I missed that... > Do you mean I should use class instead of pci_is_bridge()? > > if ((pdev->class >> 24) == PCI_BASE_CLASS_BRIDGE) > pci_type = "pci"; Well, maybe as preparation to support other classes. If you had a UART for example, the node name is 'serial'. I don't think you need to worry about those ATM. We may need some way for the name to come from the driver as not all devices have a class. Yours for example, we'd want something like 'fpga@...' ideally. Maybe that's fine to leave as a known issue. Rob