Hi Jon, On Tue, Jul 30, 2024 at 10:59:59AM +0100, Jon Hunter wrote: > Hi Bartosz, > > On 08/07/2024 08:50, Bartosz Golaszewski wrote: > > From: Bartosz Golaszewski <bartosz.golaszewski@xxxxxxxxxx> > > > > Checking the firmware register before it complete the boot process makes > > no sense, it will report 0 even if FW is available from internal memory. > > Always wait for FW to boot before continuing or we'll unnecessarily try > > to load it from nvmem/filesystem and fail. > > > > Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@xxxxxxxxxx> > > --- > > drivers/net/phy/aquantia/aquantia_firmware.c | 4 ++++ > > 1 file changed, 4 insertions(+) > > > > diff --git a/drivers/net/phy/aquantia/aquantia_firmware.c b/drivers/net/phy/aquantia/aquantia_firmware.c > > index 0c9640ef153b..524627a36c6f 100644 > > --- a/drivers/net/phy/aquantia/aquantia_firmware.c > > +++ b/drivers/net/phy/aquantia/aquantia_firmware.c > > @@ -353,6 +353,10 @@ int aqr_firmware_load(struct phy_device *phydev) > > { > > int ret; > > + ret = aqr_wait_reset_complete(phydev); > > + if (ret) > > + return ret; > > + > > /* Check if the firmware is not already loaded by pooling > > * the current version returned by the PHY. If 0 is returned, > > * no firmware is loaded. > > > Although this fixed another issue we were seeing with this driver, we have > been reviewing this change and have a question about it. > > According to the description for the function aqr_wait_reset_complete() this > function is intended to give the device time to load firmware and check > there is a valid firmware ID. > > If a valid firmware ID (non-zero) is detected, then > aqr_wait_reset_complete() will return 0 (because phy_read_mmd_poll_timeout() > returns 0 on success and -ETIMEDOUT upon a timeout). > > If it times out, then it would appear that with the above code we don't > attempt to load the firmware by any other means? > > Hence, I was wondering if we want this ... > > diff --git a/drivers/net/phy/aquantia/aquantia_firmware.c > b/drivers/net/phy/aquantia/aquantia_firmware.c > index 524627a36c6f..a167f42ae36b 100644 > --- a/drivers/net/phy/aquantia/aquantia_firmware.c > +++ b/drivers/net/phy/aquantia/aquantia_firmware.c > @@ -353,16 +353,12 @@ int aqr_firmware_load(struct phy_device *phydev) > { > int ret; > > - ret = aqr_wait_reset_complete(phydev); > - if (ret) > - return ret; > - > - /* Check if the firmware is not already loaded by pooling > + /* Check if the firmware is not already loaded by polling > * the current version returned by the PHY. If 0 is returned, > - * no firmware is loaded. > + * firmware is loaded. > */ > - ret = phy_read_mmd(phydev, MDIO_MMD_VEND1, VEND1_GLOBAL_FW_ID); > - if (ret > 0) > + ret = aqr_wait_reset_complete(phydev); > + if (!ret) > goto exit; > > ret = aqr_firmware_load_nvmem(phydev); I agree with your analysis and we also noticed this. But actually, you wouldn't want to ignore other return codes from phy_read_mmd_poll_timeout() like real errors from phy_read_mmd(): -ENODEV, -ENXIO etc. I found that the logic is more readable with a switch/case statement as below. diff --git a/drivers/net/phy/aquantia/aquantia_firmware.c b/drivers/net/phy/aquantia/aquantia_firmware.c index 524627a36c6f..d839f64471bd 100644 --- a/drivers/net/phy/aquantia/aquantia_firmware.c +++ b/drivers/net/phy/aquantia/aquantia_firmware.c @@ -353,26 +353,33 @@ int aqr_firmware_load(struct phy_device *phydev) { int ret; - ret = aqr_wait_reset_complete(phydev); - if (ret) - return ret; - - /* Check if the firmware is not already loaded by pooling - * the current version returned by the PHY. If 0 is returned, - * no firmware is loaded. + /* Check if the firmware is not already loaded by polling + * the current version returned by the PHY. */ - ret = phy_read_mmd(phydev, MDIO_MMD_VEND1, VEND1_GLOBAL_FW_ID); - if (ret > 0) - goto exit; + ret = aqr_wait_reset_complete(phydev); + switch (ret) { + case 0: + /* Some firmware is loaded => do nothing */ + return 0; + case -ETIMEDOUT: + /* VEND1_GLOBAL_FW_ID still reads 0 after 2 seconds of polling. + * We don't have full confidence that no firmware is loaded (in + * theory it might just not have loaded yet), but we will + * assume that, and load a new image. + */ + ret = aqr_firmware_load_nvmem(phydev); + if (!ret) + goto exit; - ret = aqr_firmware_load_nvmem(phydev); - if (!ret) - goto exit; + ret = aqr_firmware_load_fs(phydev); + if (ret) + return ret; - ret = aqr_firmware_load_fs(phydev); - if (ret) + break; + default: + /* PHY read error, propagate it to the caller */ return ret; + } -exit: return 0; }