On 8.06.2023 22:15, Andrew Halaney wrote:
> With wider usage on more boards, there have been reports of the
> following:
>
>     [ 315.016174] qcom-ethqos 20000.ethernet eth0: no phy at addr -1
>     [ 315.016179] qcom-ethqos 20000.ethernet eth0: __stmmac_open: Cannot attach to PHY (error: -19)
>
> which has been fairly random and isolated to specific boards.
> Early reports were written off as a hardware issue, but it has been
> prevalent enough on boards that theory seems unlikely.
>
> In bring up of a newer piece of hardware, similar was seen, but this
> time _consistently_. Moving the reset to the mdio bus level (which isn't
> exactly a lie, it is the only device on the bus so one could model it as
> such) fixed things on that platform. Analysis on sa8540p-ride shows that
> the phy's reset is not being handled during the OUI scan if the reset
> lives in the phy node:
>
>     # gpio 752 is the reset, and is active low, first mdio reads are the OUI
>     modprobe-420 [006] ..... 154.738544: mdio_access: stmmac-0 read phy:0x08 reg:0x02 val:0x0141
>     modprobe-420 [007] ..... 154.738665: mdio_access: stmmac-0 read phy:0x08 reg:0x03 val:0x0dd4
>     modprobe-420 [004] ..... 154.741357: gpio_value: 752 set 1
>     modprobe-420 [004] ..... 154.741358: gpio_direction: 752 out (0)
>     modprobe-420 [004] ..... 154.741360: gpio_value: 752 set 0
>     modprobe-420 [006] ..... 154.762751: gpio_value: 752 set 1
>     modprobe-420 [007] ..... 154.846857: gpio_value: 752 set 1
>     modprobe-420 [004] ..... 154.937824: mdio_access: stmmac-0 write phy:0x08 reg:0x0d val:0x0003
>     modprobe-420 [004] ..... 154.937932: mdio_access: stmmac-0 write phy:0x08 reg:0x0e val:0x0014
>
> Moving it to the bus level, or specifying the OUI in the phy's
> compatible ensures the reset is handled before any mdio access
> Here is tracing with the OUI approach (which skips scanning the OUI):
>
>     modprobe-549 [007] ..... 63.860295: gpio_value: 752 set 1
>     modprobe-549 [007] ..... 63.860297: gpio_direction: 752 out (0)
>     modprobe-549 [007] ..... 63.860299: gpio_value: 752 set 0
>     modprobe-549 [004] ..... 63.882599: gpio_value: 752 set 1
>     modprobe-549 [005] ..... 63.962132: gpio_value: 752 set 1
>     modprobe-549 [006] ..... 64.049379: mdio_access: stmmac-0 write phy:0x08 reg:0x0d val:0x0003
>     modprobe-549 [006] ..... 64.049490: mdio_access: stmmac-0 write phy:0x08 reg:0x0e val:0x0014
>
> The OUI approach is taken given the description matches the situation
> perfectly (taken from ethernet-phy.yaml):
>
>     - pattern: "^ethernet-phy-id[a-f0-9]{4}\\.[a-f0-9]{4}$"
>       description:
>         If the PHY reports an incorrect ID (or none at all) then the
>         compatible list may contain an entry with the correct PHY ID
>         in the above form.
>         The first group of digits is the 16 bit Phy Identifier 1
>         register, this is the chip vendor OUI bits 3:18. The
>         second group of digits is the Phy Identifier 2 register,
>         this is the chip vendor OUI bits 19:24, followed by 10
>         bits of a vendor specific ID.
>
> With this in place the sa8540p-ride's phy is probing consistently, so
> it seems the floating reset during mdio access was the issue. In either
> case, it shouldn't be floating so this improves the situation. The below
> link discusses some of the relationship of mdio, its phys, and points to
> this OUI compatible as a way to opt out of the OUI scan pre-reset
> handling which influenced this decision.
>
> Link: https://lore.kernel.org/all/dca54c57-a3bd-1147-63b2-4631194963f0@xxxxxxxxx/
> Fixes: 57827e87be54 ("arm64: dts: qcom: sa8540p-ride: Add ethernet nodes")
> Signed-off-by: Andrew Halaney <ahalaney@xxxxxxxxxx>
> ---
Reviewed-by: Konrad Dybcio <konrad.dybcio@xxxxxxxxxx>

Konrad
>  arch/arm64/boot/dts/qcom/sa8540p-ride.dts | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/arch/arm64/boot/dts/qcom/sa8540p-ride.dts b/arch/arm64/boot/dts/qcom/sa8540p-ride.dts
> index 21e9eaf914dd..5a26974dcf8f 100644
> --- a/arch/arm64/boot/dts/qcom/sa8540p-ride.dts
> +++ b/arch/arm64/boot/dts/qcom/sa8540p-ride.dts
> @@ -171,6 +171,7 @@ mdio {
>
>  		/* Marvell 88EA1512 */
>  		rgmii_phy: phy@8 {
> +			compatible = "ethernet-phy-id0141.0dd4";
>  			reg = <0x8>;
>
>  			interrupts-extended = <&tlmm 127 IRQ_TYPE_EDGE_FALLING>;
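
For anyone checking the ID in the new compatible: it should be the two values returned by the first MDIO reads in the pre-patch trace (reg 0x02 -> 0x0141, reg 0x03 -> 0x0dd4), written in the form the ethernet-phy.yaml pattern describes. A minimal sketch of that mapping follows. The helper names are hypothetical and this is not kernel code, only an illustration of the string format.

```python
import re
from typing import Tuple

# Same shape as the pattern quoted from ethernet-phy.yaml.
COMPAT_RE = re.compile(r"^ethernet-phy-id([a-f0-9]{4})\.([a-f0-9]{4})$")


def phy_id_compatible(id1: int, id2: int) -> str:
    """Build the compatible string from PHY Identifier 1 and 2.

    id1 and id2 are the 16-bit values read from MDIO registers
    0x02 and 0x03 respectively (hypothetical helper).
    """
    if not (0 <= id1 <= 0xFFFF and 0 <= id2 <= 0xFFFF):
        raise ValueError("PHY identifier registers are 16 bits wide")
    return f"ethernet-phy-id{id1:04x}.{id2:04x}"


def parse_phy_id_compatible(compatible: str) -> Tuple[int, int]:
    """Split a compatible string back into (id1, id2) (hypothetical helper)."""
    match = COMPAT_RE.match(compatible)
    if match is None:
        raise ValueError(f"not a PHY ID compatible: {compatible!r}")
    return int(match.group(1), 16), int(match.group(2), 16)


# Register values taken from the first trace in the commit message.
print(phy_id_compatible(0x0141, 0x0DD4))  # prints ethernet-phy-id0141.0dd4
```

The pattern only accepts lowercase hex, so the sketch formats with `04x` rather than `04X`.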