On Tue, Nov 26, 2019 at 8:30 AM Ard Biesheuvel <ardb@xxxxxxxxxx> wrote:
>
> The new of_devlink support breaks PCIe probing on ARM platforms booting
> via UEFI if the firmware exposes a EFI framebuffer that is backed by a
> PCI device.

Thanks for testing with of_devlink enabled!

> The reason is that the probing order gets reversed,
> resulting in a resource conflict on the framebuffer memory window when
> the PCIe probes last, causing it to give up entirely.

Just so I understand it clearly: the probe order reversal is only
between this efi-framebuffer device and the PCIe device, right? Not all
PCI devices or something like that, right?

Do you have any info on what dependency causes this reversal? Just curious.

> Given that we rely on PCI quirks to deal with EFI framebuffers that get
> moved around in memory, we cannot simply drop the memory reservation, so
> instead, let's use the device link infrastructure to register this
> dependency, and force the probing to occur in the expected order.
>
> Cc: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>
> Cc: Saravana Kannan <saravanak@xxxxxxxxxx>
> Signed-off-by: Ard Biesheuvel <ardb@xxxxxxxxxx>
> ---
>  drivers/firmware/efi/arm-init.c | 66 ++++++++++++++++++--
>  1 file changed, 61 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/firmware/efi/arm-init.c b/drivers/firmware/efi/arm-init.c
> index 311cd349a862..617226d50774 100644
> --- a/drivers/firmware/efi/arm-init.c
> +++ b/drivers/firmware/efi/arm-init.c
> @@ -14,6 +14,7 @@
>  #include <linux/memblock.h>
>  #include <linux/mm_types.h>
>  #include <linux/of.h>
> +#include <linux/of_address.h>
>  #include <linux/of_fdt.h>
>  #include <linux/platform_device.h>
>  #include <linux/screen_info.h>
> @@ -267,15 +268,70 @@ void __init efi_init(void)
>  	efi_memmap_unmap();
>  }
>
> +static bool __init efifb_overlaps_pci_range(const struct of_pci_range *range)
> +{
> +	u64 fb_base = screen_info.lfb_base;
> +
> +	if (screen_info.capabilities & VIDEO_CAPABILITY_64BIT_BASE)
> +		fb_base |= (u64)(unsigned long)screen_info.ext_lfb_base << 32;
> +
> +	return fb_base >= range->cpu_addr &&
> +	       fb_base < (range->cpu_addr + range->size);
> +}
> +
>  static int __init register_gop_device(void)
>  {
> -	void *pd;
> +	struct platform_device *pd;
> +	struct device_node *np;
> +	bool found = false;
> +	int err;
>
>  	if (screen_info.orig_video_isVGA != VIDEO_TYPE_EFI)
>  		return 0;
>
> -	pd = platform_device_register_data(NULL, "efi-framebuffer", 0,
> -					   &screen_info, sizeof(screen_info));
> -	return PTR_ERR_OR_ZERO(pd);
> +	pd = platform_device_alloc("efi-framebuffer", 0);
> +	if (!pd)
> +		return -ENOMEM;
> +
> +	err = platform_device_add_data(pd, &screen_info, sizeof(screen_info));
> +	if (err)
> +		return err;
> +
> +	/*
> +	 * If the efifb framebuffer is backed by a PCI graphics controller, we
> +	 * have to ensure that this relation is expressed using a device link
> +	 * when running in DT mode, or the probe order may be reversed,
> +	 * resulting in a resource reservation conflict on the memory window
> +	 * that the efifb framebuffer steals from the PCIe host bridge.
> +	 */
> +	for_each_node_by_type(np, "pci") {
> +		struct of_pci_range_parser parser;
> +		struct of_pci_range range;
> +		struct device *sup_dev;
> +
> +		if (found) {
> +			of_node_put(np);
> +			break;
> +		}

It looks like you are doing this here because you can't break out of
two loops when you set found = true. Is that right? If so, I think
doing this check at the end of the loop would make it more obvious
what's going on.

> +
> +		err = of_pci_range_parser_init(&parser, np);
> +		if (err) {
> +			pr_warn("of_pci_range_parser_init() failed: %d\n", err);
> +			continue;
> +		}
> +
> +		sup_dev = get_dev_from_fwnode(&np->fwnode);
> +
> +		for_each_of_pci_range(&parser, &range) {
> +			if (efifb_overlaps_pci_range(&range)) {
> +				found = true;
> +				if (!device_link_add(&pd->dev, sup_dev, 0))
> +					pr_warn("device_link_add() failed\n");

I think dev_warn(&pd->dev, ...) might make the message more useful.
Otherwise, the message is confusing.

> +				break;
> +			}
> +		}
> +		put_device(sup_dev);

Can't you do the if (found) check here? Another option is to simply do
a "goto out;" at the end of the if block where you set found = true.

> +	}
> +	return platform_device_add(pd);
>  }
> -subsys_initcall(register_gop_device);
> +device_initcall(register_gop_device);

Looks like you are doing this so that this efi-framebuffer device gets
added after the PCIe device, so that device_link_add() succeeds?

I'm wondering if it would be better to implement this as a
fwnode_operations.add_links(). Since this efi-framebuffer device won't
have any fwnode, you can create your own fwnode and implement the
add_links() op.

Not a strong opinion on this, but some food for thought.

Thanks,
Saravana
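
P.S. To make the "check found at the end of the loop" suggestion concrete, here is a rough userspace sketch of the control flow I have in mind. It is not kernel code: fake_range, fake_bridge and find_fb_bridge() are made-up stand-ins for illustration, and the refcount field only models the get_dev_from_fwnode()/put_device() pairing. The overlap test mirrors the one in the patch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct of_pci_range (only the fields used). */
struct fake_range {
	uint64_t cpu_addr;
	uint64_t size;
};

/* Hypothetical stand-in for a PCI host bridge node and its ranges. */
struct fake_bridge {
	const struct fake_range *ranges;
	size_t nr_ranges;
	int refcount;	/* models the reference taken on the supplier */
};

/* Same test as efifb_overlaps_pci_range(): base inside [addr, addr + size). */
static bool fb_in_range(uint64_t fb_base, const struct fake_range *r)
{
	return fb_base >= r->cpu_addr && fb_base < r->cpu_addr + r->size;
}

/*
 * Return the index of the first bridge whose window contains fb_base,
 * or -1 if there is none. The "found" check sits after the reference
 * is dropped, so there is one exit point per outer iteration.
 */
static int find_fb_bridge(struct fake_bridge *bridges, size_t nr,
			  uint64_t fb_base)
{
	for (size_t i = 0; i < nr; i++) {
		struct fake_bridge *b = &bridges[i];
		bool found = false;

		b->refcount++;		/* take the reference */

		for (size_t j = 0; j < b->nr_ranges; j++) {
			if (fb_in_range(fb_base, &b->ranges[j])) {
				found = true;	/* link would be added here */
				break;
			}
		}

		b->refcount--;		/* dropped on every path */

		if (found)
			return (int)i;	/* leave the outer loop */
	}

	return -1;
}
```

In the real function, leaving for_each_node_by_type() early would still need the of_node_put(np) that your patch already does before the break.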