Hi Rob,

On Mon, Dec 02, 2024 at 07:55:27AM -0600, Rob Herring wrote:
>On Mon, Dec 2, 2024 at 3:29 AM Manivannan Sadhasivam
><manivannan.sadhasivam@xxxxxxxxxx> wrote:
>>
>> On Wed, Nov 27, 2024 at 01:56:50PM -0600, Rob Herring wrote:
>> > On Fri, Nov 15, 2024 at 08:17:20PM +0530, Manivannan Sadhasivam wrote:
>> > > On Fri, Nov 15, 2024 at 10:14:10AM +0000, Peng Fan wrote:
>> > > > Hi Manivannan,
>> > > >
>> > > > > Subject: Re: [PATCH] PCI: check bridge->bus in
>> > > > > pci_host_common_remove
>> > > > >
>> > > > > On Mon, Oct 28, 2024 at 04:46:43PM +0800, Peng Fan (OSS) wrote:
>> > > > > > From: Peng Fan <peng.fan@xxxxxxx>
>> > > > > >
>> > > > > > When PCI node was created using an overlay and the overlay is
>> > > > > > reverted/destroyed, the "linux,pci-domain" property no longer exists,
>> > > > > > so of_get_pci_domain_nr will return failure. Then
>> > > > > > of_pci_bus_release_domain_nr will actually use the dynamic IDA, even
>> > > > > > if the IDA was allocated in static IDA. So the flow is as below:
>> > > > > > A: of_changeset_revert
>> > > > > >     pci_host_common_remove
>> > > > > >      pci_bus_release_domain_nr
>> > > > > >       of_pci_bus_release_domain_nr
>> > > > > >        of_get_pci_domain_nr  # fails because overlay is gone
>> > > > > >        ida_free(&pci_domain_nr_dynamic_ida)
>> > > > > >
>> > > > > > With driver calls pci_host_common_remove explicity, the flow becomes:
>> > > > > > B  pci_host_common_remove
>> > > > > >     pci_bus_release_domain_nr
>> > > > > >      of_pci_bus_release_domain_nr
>> > > > > >       of_get_pci_domain_nr  # succeeds in this order
>> > > > > >       ida_free(&pci_domain_nr_static_ida)
>> > > > > > A  of_changeset_revert
>> > > > > >     pci_host_common_remove
>> > > > > >
>> > > > > > With updated flow, the pci_host_common_remove will be called twice, so
>> > > > > > need to check 'bridge->bus' to avoid accessing invalid pointer.
>> > > > > >
>> > > > > > Fixes: c14f7ccc9f5d ("PCI: Assign PCI domain IDs by ida_alloc()")
>> > > > > > Signed-off-by: Peng Fan <peng.fan@xxxxxxx>
>> > > > >
>> > > > > I went through the previous discussion [1] and I couldn't see an
>> > > > > agreement on the point raised by Bjorn on 'removing the host bridge
>> > > > > before the overlay'.
>> > > >
>> > > > This patch is an agreement to Bjorn's idea.
>> > > >
>> > > > I have added pci_host_common_remove to remove host bridge
>> > > > before removing overlay as I wrote in commit log.
>> > > >
>> > > > But of_changeset_revert will still runs into
>> > > > pci_host_common_remove to remove the host bridge again. Per
>> > > > my view, the design of of_changeset_revert to remove
>> > > > the device tree node will trigger device remove, so even
>> > > > pci_host_common_remove was explicitly used before
>> > > > of_changeset_revert. The following call to of_changeset_revert
>> > > > will still call pci_host_common_remove.
>> > > >
>> > > > So I did this patch to add a check of 'bus' to avoid remove again.
>> > > >
>> > >
>> > > Ok. I think there was a misunderstanding. Bjorn's example driver,
>> > > 'i2c-demux-pinctrl' applies the changeset, then adds the i2c adapter for its
>> > > own. And in remove(), it does the reverse.
>> > >
>> > > But in your case, the issue is with the host bridge driver that gets probed
>> > > because of the changeset. While with 'i2c-demux-pinctrl' driver, it only
>> > > applies the changeset. So we cannot compare both drivers. I believe in your
>> > > case, 'i2c-demux-pinctrl' becomes 'jailhouse', isn't it?
>> > >
>> > > So in your case, changeset is applied by jailhouse and that causes the
>> > > platform device to be created for the host bridge and then the host bridge
>> > > driver gets probed. So during destroy(), you call of_changeset_revert() that
>> > > removes the platform device and during that process it removes the host bridge
>> > > driver.
>> > > The issue happens because during host bridge remove, it calls
>> > > pci_remove_root_bus() and that tries to remove the domain_nr using
>> > > pci_bus_release_domain_nr().
>> > >
>> > > But pci_bus_release_domain_nr() uses DT node to check whether to free the
>> > > domain_nr from static IDA or dynamic IDA. And because there is no DT node exist
>> > > at this time (it was already removed by of_changeset_revert()), it forces
>> > > pci_bus_release_domain_nr() to use dynamic IDA even though the IDA was initially
>> > > allocated from static IDA.
>> >
>> > Putting linux,pci-domain in an overlay is the same problem as aliases in
>> > overlays[1]. It's not going to work well.
>> >
>> > IMO, you can have overlays, or you can have static domains. You can't
>> > have both.
>> >
>>
>> Okay.
>>
>> > > I think a neat way to solve this issue would be by removing the OF node only
>> > > after removing all platform devices/drivers associated with that node. But I
>> > > honestly do not know whether that is possible or not. Otherwise, any other
>> > > driver that relies on the OF node in its remove() callback, could suffer from
>> > > the same issue. And whatever fix we may come up with in PCI core, it will be a
>> > > band-aid only.
>> > >
>> > > I'd like to check with Rob first about his opinion.
>> >
>> > If the struct device has an of_node set, there should be a reference
>> > count on that node. But I think that only prevents the node from being
>> > freed. It does not prevent the overlay from being detached. This is one
>> > of many of the issues with overlays Frank painstakingly documented[2].
>> >
>>
>> Ah, I do remember this page as Frank ended up creating it based on my
>> continuous nudge to add CONFIG_FS interface for applying overlays.
>>
>> So why are we applying overlays in kernel now?
>
>That's been the case for some time. Mostly it's been for fixups of old
>to new bindings, but those all got dropped at some point.
>The in kernel users are very specific use cases where we know something
>about what's in the overlay. In contrast, configfs interface allows for any
>change to any node or property with no control over it by the kernel.
>Never say never, but I just don't see that ever happening upstream.

So should I switch to configfs for the jailhouse case? Currently we use an
overlay to add a virtual PCI node to the kernel device tree.

Thanks,
Peng

>
>Rob
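
For anyone following the thread, the static-vs-dynamic IDA mismatch described
above can be modelled in a few lines of user-space C. This is only a sketch
under stated assumptions, not the kernel implementation: the names id_pool,
fake_node, domain_alloc, domain_release and static_id_leaks are all made up
for illustration. The one thing taken from the thread is that the release
path looks at the DT node again to decide which IDA to free from.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define POOL_SIZE 8

/* Hypothetical stand-in for an IDA: a small table of in-use IDs. */
struct id_pool {
    bool used[POOL_SIZE];
};

/* Hypothetical stand-in for a DT node that may carry "linux,pci-domain". */
struct fake_node {
    bool has_domain_prop;   /* false once the overlay has been reverted */
    int domain;             /* value of the property while it exists */
};

static struct id_pool static_pool;   /* models pci_domain_nr_static_ida */
static struct id_pool dynamic_pool;  /* models pci_domain_nr_dynamic_ida */

/* Reserve a specific ID; returns the ID, or -1 if out of range or taken. */
int pool_alloc_exact(struct id_pool *p, int id)
{
    if (id < 0 || id >= POOL_SIZE || p->used[id])
        return -1;
    p->used[id] = true;
    return id;
}

/* Reserve the lowest free ID; returns -1 if the pool is full. */
int pool_alloc_any(struct id_pool *p)
{
    for (int id = 0; id < POOL_SIZE; id++) {
        if (!p->used[id]) {
            p->used[id] = true;
            return id;
        }
    }
    return -1;
}

/* Release an ID; returns false if it was not allocated from this pool. */
bool pool_free(struct id_pool *p, int id)
{
    if (id < 0 || id >= POOL_SIZE || !p->used[id])
        return false;
    p->used[id] = false;
    return true;
}

/* Allocation: property present means static pool, otherwise dynamic pool. */
int domain_alloc(const struct fake_node *n)
{
    if (n->has_domain_prop)
        return pool_alloc_exact(&static_pool, n->domain);
    return pool_alloc_any(&dynamic_pool);
}

/* Release re-reads the node to pick the pool; this is the weak point. */
bool domain_release(const struct fake_node *n, int id)
{
    if (n->has_domain_prop)
        return pool_free(&static_pool, id);
    return pool_free(&dynamic_pool, id);
}

/* Returns true if the static ID is still marked in use after release. */
bool static_id_leaks(bool revert_overlay_first)
{
    struct fake_node node = { .has_domain_prop = true, .domain = 2 };
    int id;

    memset(&static_pool, 0, sizeof(static_pool));
    memset(&dynamic_pool, 0, sizeof(dynamic_pool));

    id = domain_alloc(&node);
    assert(id == node.domain);

    if (revert_overlay_first)
        node.has_domain_prop = false;   /* overlay gone before release */

    domain_release(&node, id);
    return static_pool.used[id];
}
```

In this model static_id_leaks(true) corresponds to flow A (the property is
gone by the time release runs, so the wrong pool is consulted and the static
ID stays allocated), while static_id_leaks(false) corresponds to releasing
while the property still exists, as in flow B.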