Hi Jim,
On 24.05.22 at 18:54, Jim Quinlan wrote:
On Mon, May 23, 2022 at 6:10 PM Bjorn Helgaas <helgaas@xxxxxxxxxx> wrote:
On Sat, May 21, 2022 at 02:51:42PM -0400, Jim Quinlan wrote:
On Sat, May 21, 2022 at 12:43 PM Bjorn Helgaas <helgaas@xxxxxxxxxx> wrote:
On Wed, May 18, 2022 at 03:42:11PM -0400, Jim Quinlan wrote:
commit 93e41f3fca3d ("PCI: brcmstb: Add control of subdevice
voltage regulators")
introduced a regression on the PCIe RPi4 Compute Module. If the
PCIe endpoint node described in [2] was missing, no linkup would
be attempted, and subsequent accesses would cause a panic
because this particular PCIe HW causes a CPU abort on illegal
accesses (instead of returning 0xffffffff).
We fix this by allowing the DT endpoint subnode to be missing.
This is important for platforms like the CM4 which have a
standard PCIe socket and the endpoint device is unknown.
I think the problem here is that on the CM, we try to enumerate
devices that are not powered up, isn't it? The commit log should
say something about that power situation and how the driver learns
about the power regulators instead of just pointing at a DT
endpoint node.
This is incorrect. The regression occurred because the code
mistakenly skips PCIe-linkup if the PCI portdrv DT node does not
exist. With our RC HW, doing a config space access to bus 1 w/o
first linking up results in a CPU abort. This regression has
nothing to do with EP power at all.
OK, I think I'm starting to see, but I'm still missing some things.
67211aadcb4b ("PCI: brcmstb: Add mechanism to turn on subdev
regulators") added pci_subdev_regulators_add_bus() as an .add_bus()
method. This is called by pci_alloc_child_bus(), and if the DT
describes any regulators for the bridge leading to the new child bus,
we turn them on.
Then 93e41f3fca3d ("PCI: brcmstb: Add control of subdevice voltage
regulators") added brcm_pcie_add_bus() and made *it* the .add_bus()
method. It turns on the regulators and brings the link up, but it
skips both if there's no DT node for the bridge to the new bus.
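To make the control flow described above concrete, here is a minimal user-space sketch. It is not the driver code: struct model_bus, the model_add_bus_*() functions and link_up_without_dt_node() are hypothetical names invented for illustration, and the "fixed" variant only reflects the intent stated in this thread (link-up must not depend on the DT node), not the actual patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the driver state; not a kernel type. */
struct model_bus {
	bool has_dt_node;	/* models bus->dev->of_node being non-NULL */
	bool regulators_on;
	bool link_up;
};

/* Models the regressed flow: no DT node skips regulators AND link-up. */
static void model_add_bus_regressed(struct model_bus *bus)
{
	if (!bus->has_dt_node)
		return;
	bus->regulators_on = true;
	bus->link_up = true;
}

/* Models the intended flow: only the regulator step depends on the DT node. */
static void model_add_bus_fixed(struct model_bus *bus)
{
	if (bus->has_dt_node)
		bus->regulators_on = true;
	bus->link_up = true;
}

/* Reports whether the link comes up when the bridge has no DT node. */
bool link_up_without_dt_node(bool fixed)
{
	struct model_bus bus = { .has_dt_node = false };

	if (fixed)
		model_add_bus_fixed(&bus);
	else
		model_add_bus_regressed(&bus);
	return bus.link_up;
}

/* Self-check: the regressed flow leaves the link down, the fixed one does not. */
void model_selfcheck(void)
{
	assert(!link_up_without_dt_node(false));
	assert(link_up_without_dt_node(true));
}
```

In the regressed variant the early return is what leaves the link down, which on this RC HW means a later config access to bus 1 aborts.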
Hi Bjorn,
Yes, I meant it to skip turning on the regulators if the DT node was
missing, but I failed to notice that it would also skip the PCIe
link-up. As you may have guessed, all of my test systems have the PCIe
root port DT node.
I guess RPi4 CM has no DT node to describe regulators, so we skip both
turning them on *and* bringing the link up?
Yes. One repo did not have this node (Cyril/Debian?), one did
(https://github.com/raspberrypi/firmware/tree/master/boot).
Of course there is nothing wrong with omitting the node; the PCIe
link-up should happen regardless.
Please ignore the vendor tree, because you only have to care about
mainline kernel and DT here.
But above you say it's the *endpoint* node that doesn't exist. The
existing code looks like it's checking for the *bridge* node
(bus->dev->of_node). We haven't even enumerated the devices on the
child bus, so we don't know about them at this point.
You are absolutely correct and I must change the commit message
to say the "root port DT node". I'm sorry; this mistake likely did not
help you understand the fix. :-(
What happens if there is a DT node for the bridge, but it doesn't
describe any regulators? I assume regulator_bulk_get() will fail, and
it looks like that might still keep us from bringing the link up?
The regulator_bulk_get() function does not fail if the regulators are
not present. Instead it "gets" a dummy device and issues a warning per
missing regulator. A version of my pull request included code to
prescan the DT node and call regulator_bulk_get() with only the names
of the regulators present, but IIRC this was NAKed.
Hopefully I will not be swamped with RPi developers' emails when they
think these warnings are an issue.
This won't be the first driver complaining about missing regulators and
won't be the last one. So don't expect an email from me ;-)
Best regards