Re: [PATCH net-next 3/3] net: phy: realtek: add hwmon support for temp sensor on RTL822x

On 10.01.2025 22:10, Andrew Lunn wrote:
>> - over-temp alarm remains set, even if temperature drops below threshold
> 
>> +int rtl822x_hwmon_init(struct phy_device *phydev)
>> +{
>> +	struct device *hwdev, *dev = &phydev->mdio.dev;
>> +	const char *name;
>> +
>> +	/* Ensure over-temp alarm is reset. */
>> +	phy_clear_bits_mmd(phydev, MDIO_MMD_VEND2, RTL822X_VND2_TSALRM, 3);
> 
> So it is possible to clear the alarm.
> 
> I know you wanted to experiment with this some more....
> 
> If the alarm is still set, does that prevent the PHY renegotiating the
> higher link speed? If you clear the alarm, does that allow it to
> renegotiate the higher link speed? Or is a down/up still required?
> Does an down/up clear the alarm if the temperature is below the
> threshold?
> 
> Also, does HWMON support clearing alarms? Writing a 0 to the file? Or
> are they supported to self clear on read?
> 
> I'm wondering if we are heading towards ABI issues? You have defined:
> 
> - over-temp alarm remains set, even if temperature drops below threshold
> 
> so that kind of eliminates the possibility of implementing self
> clearing any time in the future. Explicit clearing via a write is
> probably O.K, because the user needs to take an explicit action.  Are
> there other ABI issues i have not thought about.
> 

According to Guenter's feedback, the alarm attribute must not be
writable and is expected to be self-clearing on read.
If we cleared the alarm in the chip whenever the alarm attribute is
read, we could end up with the following ugly scenario:

1. The temperature threshold is exceeded and the chip reduces the
   speed to 1Gbps
2. The temperature falls below the alarm threshold
3. The user runs "sensors" to check the current temperature
4. The implicit alarm attribute read causes the chip to clear the
   alarm and re-enable 2.5Gbps speed, so the temperature alarm
   threshold is exceeded again very soon.

What isn't nice here is that this is not transparent to the user:
a command that is read-only from the user's perspective cancels the
chip's protective measure.
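
To make the scenario concrete, here is a toy model of it in plain C.
This is only a sketch and not driver code: struct phy_model, both
helper functions and the 90 degree C threshold are made up for
illustration, and the speed change on clear follows step 4 above
rather than anything verified on hardware.

```c
#include <stdbool.h>

/* Toy model of the PHY's thermal protection. Not driver code. */
struct phy_model {
	int alarm_mc;		/* over-temp threshold, millidegree C (assumed) */
	bool alarm_latched;	/* stays set until explicitly cleared */
	int speed_mbps;		/* 2500 normally, 1000 while alarm is latched */
};

/* Feed a new temperature sample into the model. */
static void phy_model_update(struct phy_model *p, int temp_mc)
{
	if (temp_mc > p->alarm_mc)
		p->alarm_latched = true;
	/* The latch, not the current temperature, decides the speed. */
	p->speed_mbps = p->alarm_latched ? 1000 : 2500;
}

/* Hypothetical clear-on-read alarm attribute: report, then clear. */
static bool phy_model_read_alarm(struct phy_model *p)
{
	bool was_set = p->alarm_latched;

	p->alarm_latched = false;
	p->speed_mbps = 2500;	/* side effect: protection is cancelled */
	return was_set;
}
```

Feeding in 95000, then 85000, then calling phy_model_read_alarm()
walks through steps 1 to 4: the read reports the alarm and, as a side
effect, puts the model back to 2500, after which the next sample above
the threshold latches the alarm again.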

There's no existing hwmon attribute that is meant for the user to
clear a hardware alarm after taking measures to protect the chip
from overheating.

> 	Andrew

Heiner



