Hi Andrew, >> When the PHY's temparature is back to normal (low threshold, it is >> 85 degrees) it restores user/default link speed settings. > > Hi Igor > > Please could you also export the temperature using HWMON. The Marvell > PHY drivers are good examples. > > I also have a patch for driver/net/phy/aquantia.c which adds HWMON > support to that PHY. We actually have almost ready patches to submit hwmon support separately for both AQC USB and AQC PCI MAC drivers. Will do that in near time. >> + if (aqc111_data->thermal_throttling) >> + speed = SPEED_100; >> + > > That is a big jump. Do you need to cool is down quickly, or would > 1Gbps be sufficient? I think as a user, i would prefer it ping-pongs > between 5G and 1G, not 5G and 100M. In case overheat happens that really means something bad is happening outside, or heatsink is bad/detached. I'll consult our HW team once again, but 100M was the original request from our phy/hw team. It seems it really much less on power and 1G may not be enough. > >> if (autoneg == AUTONEG_ENABLE) { >> switch (speed) { >> case SPEED_5000: > > It looks like this only works when auto-neg is enabled. If i've fixed > configured it i don't think this works? Looks you are right, will check this. >> + if (aqc111_data->thermal_throttling && >> + temperature <= AQ_NORMAL_TEMP_THRESHOLD) { >> + netdev_info(dev->net, "The temperature is back to normal(%u)", >> + AQ_NORMAL_TEMP_THRESHOLD / 100); > > How often do you see these messages? I'm wondering if they need to be > rate limited? In worst case that will be at least limited by link down/up internal, which in case of 5G normally takes 3-5secs. >> + if (!aqc111_data->thermal_throttling && >> + temperature >= AQ_CRITICAL_TEMP_THRESHOLD) { > > Should there be some hysteresis in here? In fact, if temperature is > AQ_CRITICAL_TEMP_THRESHOLD it is both back to normal, and throttled at > the same time! NORMAL_TEMP and CRITICAL_TEMP values are actually a hysteresis. In cool down case only after TEMP_NORMAL temperature link will be flipped back again. > >> + netdev_warn(dev->net, "Critical temperature(%u) is reached", >> + AQ_CRITICAL_TEMP_THRESHOLD / 100); >> + aqc111_data->thermal_throttling = 1; >> + reset_speed = 1; > > update_speed might be a better name, since you are not always > resetting it. Agreed. Regards, Igor