[REGRESSION] Keystone PCI driver probing and SerDes PLL timeout

Hello,

When testing the IOT2050 Advanced M.2 platform with Linux CIP 6.1
we came across a breakage in the probing of the Keystone PCI driver
(drivers/pci/controller/dwc/pci-keystone.c). Probing worked correctly
in the previous version we were using, v5.10.

To debug this we switched to mainline Linux, and bisecting led us to
commit e611f8cd8717 as the culprit. With it applied we get the following
messages:

[   10.954597] phy-am654 910000.serdes: Failed to enable PLL
[   10.960153] phy phy-910000.serdes.3: phy poweron failed --> -110
[   10.967485] keystone-pcie 5500000.pcie: failed to enable phy
[   10.973560] keystone-pcie: probe of 5500000.pcie failed with error -110

This timeout occurs in serdes_am654_enable_pll(), called from the
phy_ops .power_on() hook.
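
For readers less familiar with the error code: on Linux, -110 is
-ETIMEDOUT, which is what a bounded "poll until the PLL reports lock" loop
returns when the lock never shows up in time. The following is only a
userspace sketch of that pattern, not the driver's code. The names
fake_pll, fake_pll_locked() and fake_enable_pll() are hypothetical, and the
poll budget stands in for whatever timeout the real driver uses.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the hardware: lock is reported after N polls. */
struct fake_pll {
	unsigned int polls_until_lock;	/* polls needed before lock is seen */
	unsigned int polls_done;	/* polls performed so far */
};

/* Pretend to read the PLL status; true once enough polls have happened. */
static bool fake_pll_locked(struct fake_pll *pll)
{
	pll->polls_done++;
	return pll->polls_done >= pll->polls_until_lock;
}

/*
 * Poll for lock at most max_polls times.
 * Returns 0 if lock was seen, -ETIMEDOUT if the budget ran out first.
 */
static int fake_enable_pll(struct fake_pll *pll, unsigned int max_polls)
{
	for (unsigned int i = 0; i < max_polls; i++) {
		if (fake_pll_locked(pll))
			return 0;
	}
	return -ETIMEDOUT;
}
```

With a budget of 10 polls, a fake PLL that locks after 3 polls returns 0,
while one that needs 100 polls returns -ETIMEDOUT. This is the failure shape
we see propagated through phy power-on into the keystone-pcie probe.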

Given the error messages and the contents of the commit, we believe this
is caused by an unidentified race condition in the probing of the Keystone
PCI driver when enabling the PHY PLLs, since a change in which workqueue
deferred probing runs on should not affect whether probing succeeds.
Further supporting the existence of a race condition, commit 86bfbb7ce4f6
(a scheduler commit) fixes probing, most likely unintentionally, meaning
that the problem may arise again in the future.

One possible explanation is that there are prerequisites for enabling the PLL
that are not being met when e611f8cd8717 is applied. Confirming whether this
is the case would need help from people more familiar with the hardware
details.

As official support specifically for the IOT2050 Advanced M.2 platform was
introduced in Linux v6.3 (so in the middle of the commits mentioned above)
all of our testing was done with the latest mainline DeviceTree with [1]
applied on top.

This is being reported as a regression even though, technically, things are
working with the current state of mainline, since we believe the current fix
to be an unintended by-product of other work.

#regzbot introduced: e611f8cd8717

[1]: https://lore.kernel.org/all/cover.1699087938.git.jan.kiszka@xxxxxxxxxxx/



