On 06.08.22 at 13:58, Tokunori Ikegami wrote:
Note: Sorry, let me resend the mail below in text format, since it was
not delivered to the mailing lists because it contained an HTML subpart.
Hi,
Thanks for your comments.
On 2022/08/06 17:31, Guenter Roeck wrote:
On Sat, Aug 06, 2022 at 02:46:06PM +0900, Tokunori Ikegami wrote:
NVMe drives optionally support the host controlled thermal management
feature.
The thermal management temperatures are different from the
temperature thresholds.
So add functionality to set the throttling temperature values.
Signed-off-by: Tokunori Ikegami <ikegami.t@xxxxxxxxx>
I think the suggested attributes actually do not match the
throttling temperatures, as the definitions below show.
temp[1-*]_emergency: Temperature emergency max value, for chips
supporting more than two upper temperature limits.
temp[1-*]_lcrit: Temperature critical min value, typically lower
than corresponding temp_min values.
Thermal Management Temperature 1 (TMT1): This field specifies the
temperature, in Kelvins, when the controller begins to transition to
lower power active power states or performs vendor specific thermal
management actions while minimizing the impact on performance (e.g.,
light throttling) in order to attempt to reduce the Composite
Temperature.
Thermal Management Temperature 2 (TMT2): This field specifies the
temperature, in Kelvins, when the controller begins to transition to
lower power active power states or perform vendor specific thermal
management actions regardless of the impact on performance (e.g.,
heavy throttling) in order to attempt to reduce the Composite
Temperature.
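Since TMT1 and TMT2 are specified in Kelvins while hwmon temperature attributes are expressed in millidegrees Celsius, any mapping between the two needs a unit conversion. The following is only a minimal sketch of that arithmetic; the function names are hypothetical and are not taken from the patch under review.

```python
# Illustrative helpers only; names are hypothetical, not from the patch.
ABSOLUTE_ZERO_MILLICELSIUS = -273150  # 0 K expressed in millidegrees Celsius


def kelvin_to_millicelsius(kelvin: int) -> int:
    """Convert a TMT-style value in Kelvins to hwmon millidegrees Celsius."""
    return kelvin * 1000 + ABSOLUTE_ZERO_MILLICELSIUS


def millicelsius_to_kelvin(millicelsius: int) -> int:
    """Convert millidegrees Celsius to whole Kelvins, rounding to nearest."""
    return (millicelsius - ABSOLUTE_ZERO_MILLICELSIUS + 500) // 1000


if __name__ == "__main__":
    # Example values chosen for illustration, not read from a real drive.
    print(kelvin_to_millicelsius(343))
    print(millicelsius_to_kelvin(70000))
```

Because the Kelvin fields are whole numbers, a value written in millidegrees Celsius cannot always be represented exactly, so the reverse conversion has to round.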
Maybe those two throttle thresholds could be represented by tempX_crit and tempX_emergency,
and the special throttle effect could be documented in the driver's documentation.
Since tempX_crit is already used to report CCTEMP, maybe this value could be reported with tempX_rated_max instead?
As far as I know, CCTEMP is the maximum composite temperature rating of the NVMe device, so reporting it as tempX_rated_max would make sense.
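To make the suggested mapping concrete, here is a small sketch of how the three NVMe values would line up with hwmon attribute names. It assumes TMT1 goes to tempX_crit and TMT2 to tempX_emergency (the order is an interpretation of the suggestion above, not something stated explicitly), and the helper name is hypothetical.

```python
# Assumed reading of the suggestion: lighter throttling (TMT1) maps to "crit",
# heavier throttling (TMT2) maps to "emergency", CCTEMP moves to "rated_max".
PROPOSED_MAPPING = {
    "TMT1": "crit",
    "TMT2": "emergency",
    "CCTEMP": "rated_max",
}


def hwmon_attribute(field: str, channel: int = 1) -> str:
    """Return the hwmon sysfs attribute name proposed for an NVMe field."""
    return f"temp{channel}_{PROPOSED_MAPPING[field]}"


if __name__ == "__main__":
    for field in PROPOSED_MAPPING:
        print(field, "->", hwmon_attribute(field))
```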
Armin Wolf
NACK. There are several existing limit attributes which can be used
for this purpose. I would suggest to use EMERGENCY and LCRIT attributes.
Furthermore, one can not just extend the hwmon ABI without discussion,
much less as part of a patch introducing its use. Any attribute introduced
into the ABI must benefit more than one device, and a matching
implementation in the sensors command and the lm-sensors library is
expected.
Sorry, I am not sure about the hwmon ABI situation, but if possible
could you please consider or discuss extending the attributes based on
this patch review, since the suggested attributes seem difficult to use
instead? (Is it difficult?)
By the way, I have already created the lm-sensors pull request below.
<https://github.com/lm-sensors/lm-sensors/pull/406>
Regards,
Ikegami
Guenter