On 3/13/2024 6:15 PM, Linus Torvalds wrote:
> On Wed, 13 Mar 2024 at 16:29, Florian Fainelli <f.fainelli@xxxxxxxxx> wrote:
>> On this specific commit 7ee988770326fca440472200c3eb58935fe712f6, there
>> is a 100% failure for at least 3 devices out of the 16 that are running
>> the test.
>
> Hmm. I have no idea what is going on, and the unimac-mdio probe
> function (one of the things that seem to take forever on your setup)
> looks fairly simple.
>
> There doesn't even seem to be any timers involved.
>
> That said - one of the things it does is
>
>     unimac_mdio_probe ->
>       unimac_mdio_clk_set ->
>         clk_prepare_enable
>
> and maybe that's a pattern, because you report that
> brcm_pcie_resume_noirq is another problem spot (on resume).
>
> And guess what brcm_pcie_resume_noirq() does?
>
> Yup. clk_prepare_enable().
>
> So I'm wondering if there's some interaction with some clock driver?
> That might explain why it shows up on some arm platforms but not
> elsewhere.
>
> I may be barking *entirely* up the wrong tree, though. I was just
> looking at that unimac probe and going "there's absolutely _nothing_
> timer-related here" and that clk thing looked like it might at least
> have _some_ relevance.
FWIW, we use the clk-scmi.c driver, and the implementation of the SCMI
platform/server resides in the ARM EL3 trusted firmware, which also has
not changed. Ultimately this results in an ARM SMC call made to the
firmware after having posted some SCMI message in a shared memory
region. None of that has changed or is new, but it does also require me
to look at drivers/firmware/arm_scmi/ for possible changes.
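
To make that path concrete, here is a rough userspace model of it. This is
only a sketch and not the clk-scmi.c or arm_scmi code: the mailbox layout,
the command id and every function name below are made up for illustration,
and the SMC is replaced by a plain function call standing in for the
firmware.

```c
#include <stdint.h>

/* Hypothetical, heavily simplified shared-memory mailbox (not the real SCMI layout). */
struct shmem_msg {
	uint32_t channel_free;	/* 1 = channel idle, 0 = message in flight */
	uint32_t msg_header;	/* which command is being requested */
	uint32_t payload[2];	/* command arguments: clock id, enable flag */
	int32_t  status;	/* filled in by the "firmware" */
};

#define CMD_CLOCK_ENABLE 0x1u	/* made-up command id for this sketch */
#define NUM_CLOCKS       4u	/* made-up number of firmware-owned clocks */

static struct shmem_msg shmem = { .channel_free = 1 };
static uint32_t fw_clock_state[NUM_CLOCKS];	/* stand-in for firmware-side state */

/* Stand-in for the SMC instruction: the "firmware" consumes the posted message. */
static void fake_smc_call(void)
{
	if (shmem.msg_header == CMD_CLOCK_ENABLE && shmem.payload[0] < NUM_CLOCKS) {
		fw_clock_state[shmem.payload[0]] = shmem.payload[1];
		shmem.status = 0;
	} else {
		shmem.status = -1;
	}
	shmem.channel_free = 1;	/* hand the channel back to the caller */
}

/* Models the driver side: post the message, trap to firmware, read the status. */
int model_clk_enable(uint32_t clk_id)
{
	if (!shmem.channel_free)
		return -1;	/* channel busy */

	shmem.channel_free = 0;
	shmem.msg_header = CMD_CLOCK_ENABLE;
	shmem.payload[0] = clk_id;
	shmem.payload[1] = 1;

	fake_smc_call();	/* on real hardware: SMC into EL3 */

	return shmem.status;
}
```

In this model, model_clk_enable(2) returns 0 and marks clock 2 as enabled on
the "firmware" side, while an out-of-range id such as 9 returns -1. The
point is only that the caller blocks until the firmware hands the channel
back, so a delay could sit on either side of that call.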
Thanks!
--
Florian