Re: [PATCH] nvme: hwmon: provide temperature min and max values for each sensor

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 11/10/19 6:17 AM, Akinobu Mita wrote:
According to the NVMe specification, the over temperature threshold and
under temperature threshold features shall be implemented for Composite
Temperature if a non-zero WCTEMP field value is reported in the Identify
Controller data structure.  The features are also implemented for all
implemented temperature sensors (i.e., all Temperature Sensor fields that
report a non-zero value).

This provides the over temperature threshold and under temperature
threshold for each sensor as temperature min and max values of hwmon
sysfs attributes.

The WCTEMP is already provided as a temperature max value for Composite
Temperature, but this change isn't incompatible.  Because the default
value of the over temperature threshold for Composite Temperature is
the WCTEMP.

This also provides alarm attributes for each temperature sensor.  But all
alarm conditions are same, because there is only a single bit in
Critical Warning field that indicates one of the temperature is outside of
a temperature threshold.


I think it would be more appropriate to report the alarm only for the
composite temperature, reason being that we don't really know which individual
sensor it is associated with.

Example output from the "sensors" command:

nvme-pci-0100
Adapter: PCI adapter
Composite:    +53.0 C  (low  = -273.0 C, high = +70.0 C)
                        (crit = +80.0 C)
Sensor 1:     +56.0 C  (low  = -273.0 C, high = +65262.0 C)
Sensor 2:     +51.0 C  (low  = -273.0 C, high = +65262.0 C)
Sensor 5:     +73.0 C  (low  = -273.0 C, high = +65262.0 C)


Have you tried writing the limits ? On my Intel NVME drive (SSDPEKKW512G7), writing
any minimum limit on the Composite temperature sensor results in a temperature
warning, and that warning is sticky until I reset the controller.
I don't see that problem on Samsung SSD 970 EVO 500GB; I have not yet tried others.

root@jupiter:/sys/class/hwmon/hwmon0# sensors nvme-pci-0100
nvme-pci-0100
Adapter: PCI adapter
Composite:    +30.0°C  (low  = -273.0°C, high = +70.0°C)
                       (crit = +80.0°C)

root@jupiter:/sys/class/hwmon/hwmon0# echo 0 > temp1_min
root@jupiter:/sys/class/hwmon/hwmon0# sensors nvme-pci-0100
nvme-pci-0100
Adapter: PCI adapter
Composite:    +30.0°C  (low  =  +0.0°C, high = +70.0°C)  ALARM
                       (crit = +80.0°C)

It doesn't seem to matter which temperature I write; writing -273000 has
the same result.

[This is actually why I didn't use the features commands; not that I had observed
 the problem, but I was concerned that problems like this would show up.]

Cc: Keith Busch <kbusch@xxxxxxxxxx>
Cc: Jens Axboe <axboe@xxxxxx>
Cc: Christoph Hellwig <hch@xxxxxx>
Cc: Sagi Grimberg <sagi@xxxxxxxxxxx>
Cc: Jean Delvare <jdelvare@xxxxxxxx>
Cc: Guenter Roeck <linux@xxxxxxxxxxxx>
Signed-off-by: Akinobu Mita <akinobu.mita@xxxxxxxxx>
---
This patch depends on the patch "nvme: Add hardware monitoring support" [1]
[1] http://lists.infradead.org/pipermail/linux-nvme/2019-November/027883.html

  drivers/nvme/host/nvme-hwmon.c | 98 ++++++++++++++++++++++++++++++++++++------
  include/linux/nvme.h           |  6 +++
  2 files changed, 90 insertions(+), 14 deletions(-)

diff --git a/drivers/nvme/host/nvme-hwmon.c b/drivers/nvme/host/nvme-hwmon.c
index 5480cbb..79323b2 100644
--- a/drivers/nvme/host/nvme-hwmon.c
+++ b/drivers/nvme/host/nvme-hwmon.c
@@ -15,6 +15,46 @@ struct nvme_hwmon_data {
  	struct mutex read_lock;
  };
+static int nvme_get_temp_thresh(struct nvme_ctrl *ctrl, int sensor, bool under,
+				long *temp)
+{
+	unsigned int threshold = sensor << NVME_TEMP_THRESH_SELECT_SHIFT;
+	int status;
+	int ret;
+
+	if (under)
+		threshold |= NVME_TEMP_THRESH_TYPE_UNDER;
+
+	ret = nvme_get_features(ctrl, NVME_FEAT_TEMP_THRESH, threshold, NULL, 0,
+				&status);
+	if (!ret)
+		*temp = ((status & NVME_TEMP_THRESH_MASK) - 273) * 1000;
+
+	return ret <= 0 ? ret : -EIO;
+}
+
+static int nvme_set_temp_thresh(struct nvme_ctrl *ctrl, int sensor, bool under,
+				long temp)
+{
+	unsigned int threshold = sensor << NVME_TEMP_THRESH_SELECT_SHIFT;
+	int status;
+	int ret;
+
+	temp = temp / 1000 + 273;
+	if (temp > NVME_TEMP_THRESH_MASK)
+		return -EINVAL;
+

Traditionally we use clamp_val() in hwmon drivers to adjust value ranges
for limit attributes, reason being that we can't expect userspace to dig
through per-sensor-type documentation to identify valid limits. Also, note
that the above does not handle negative values well (-274000 -> -274 -> -1).
I would suggest something like

	temp = temp / 1000 + 273;
	temp = clamp_val(temp, 0, NVME_TEMP_THRESH_MASK);

or, if you want to be fancy;

	temp = DIV_ROUND_CLOSEST(temp, 1000) - 273;
	temp = clamp_val(temp, 0, NVME_TEMP_THRESH_MASK);

+	threshold |= temp;
+
+	if (under)
+		threshold |= NVME_TEMP_THRESH_TYPE_UNDER;
+
+	ret = nvme_set_features(ctrl, NVME_FEAT_TEMP_THRESH, threshold, NULL, 0,
+				&status);

I am a bit baffled here. The last parameter of nvme_set_features() (and nvme_get_features)
is a pointer to u32, but status is declared as int. I would have assumed this generates
a compiler warning, but it doesn't, at least not with my version of gcc.

Either case, it might be better to declare status as u32 (unless I did not have enough
coffee and I am missing something).

Also, I assume that the returned status value is irrelevant. I don't find useful
information in the specification, but I may be missing it.

+
+	return ret <= 0 ? ret : -EIO;
+}
+
  static int nvme_hwmon_get_smart_log(struct nvme_hwmon_data *data)
  {
  	int ret;
@@ -39,8 +79,12 @@ static int nvme_hwmon_read(struct device *dev, enum hwmon_sensor_types type,
  	 */
  	switch (attr) {
  	case hwmon_temp_max:
-		*val = (data->ctrl->wctemp - 273) * 1000;
+		err = nvme_get_temp_thresh(data->ctrl, channel, false, val);
+		if (err)
+			*val = (data->ctrl->wctemp - 273) * 1000;

This would report WCTEMP for all sensors on errors, including errors seen while
the controller is resetting. I think it should be something like

		int err = 0;
		...

		if (!channel)
			*val = (data->ctrl->wctemp - 273) * 1000;
		else
			err = nvme_get_temp_thresh(data->ctrl, channel, false, val);
		return err;

assuming we keep using ctrl->wctemp (see below). If changing the upper Composite
temperature sensor limit changes wctemp, and we don't update it, we should not
use it at all after registration and just report the error.

  		return 0;
+	case hwmon_temp_min:
+		return nvme_get_temp_thresh(data->ctrl, channel, true, val);
  	case hwmon_temp_crit:
  		*val = (data->ctrl->cctemp - 273) * 1000;
  		return 0;
@@ -73,6 +117,23 @@ static int nvme_hwmon_read(struct device *dev, enum hwmon_sensor_types type,
  	return err;
  }
+static int nvme_hwmon_write(struct device *dev, enum hwmon_sensor_types type,
+			    u32 attr, int channel, long val)
+{
+	struct nvme_hwmon_data *data = dev_get_drvdata(dev);
+
+	switch (attr) {
+	case hwmon_temp_max:
+		return nvme_set_temp_thresh(data->ctrl, channel, false, val);

Does this change WCTEMP if written on channel 0 ? If so, we would have to update
the cached value of ctrl->wctemp (or never use it after registration).

+	case hwmon_temp_min:
+		return nvme_set_temp_thresh(data->ctrl, channel, true, val);
+	default:
+		break;
+	}
+
+	return -EOPNOTSUPP;
+}
+
  static const char * const nvme_hwmon_sensor_names[] = {
  	"Composite",
  	"Sensor 1",
@@ -105,13 +166,13 @@ static umode_t nvme_hwmon_is_visible(const void *_data,
  			return 0444;
  		break;
  	case hwmon_temp_max:
+	case hwmon_temp_min:
  		if (!channel && data->ctrl->wctemp)
-			return 0444;
+			return 0644;
+		else if (data->log.temp_sensor[channel - 1])
+			return 0644;

This ends up with a negative index into data->log.temp_sensor
if data->ctrl->wctemp == 0. It needs to be
		else if (channel && data->log.temp_sensor[channel - 1])
It can also be written as a single conditional since the return value is the same.

  		break;
  	case hwmon_temp_alarm:
-		if (!channel)
-			return 0444;
-		break;
  	case hwmon_temp_input:
  	case hwmon_temp_label:
  		if (!channel || data->log.temp_sensor[channel - 1])
@@ -126,16 +187,24 @@ static umode_t nvme_hwmon_is_visible(const void *_data,
  static const struct hwmon_channel_info *nvme_hwmon_info[] = {
  	HWMON_CHANNEL_INFO(chip, HWMON_C_REGISTER_TZ),
  	HWMON_CHANNEL_INFO(temp,
-			   HWMON_T_INPUT | HWMON_T_MAX | HWMON_T_CRIT |
+			   HWMON_T_INPUT | HWMON_T_MAX | HWMON_T_MIN |
+				HWMON_T_CRIT | HWMON_T_LABEL | HWMON_T_ALARM,
+			   HWMON_T_INPUT | HWMON_T_MAX | HWMON_T_MIN |
+				HWMON_T_LABEL | HWMON_T_ALARM,
+			   HWMON_T_INPUT | HWMON_T_MAX | HWMON_T_MIN |
+				HWMON_T_LABEL | HWMON_T_ALARM,
+			   HWMON_T_INPUT | HWMON_T_MAX | HWMON_T_MIN |
+				HWMON_T_LABEL | HWMON_T_ALARM,
+			   HWMON_T_INPUT | HWMON_T_MAX | HWMON_T_MIN |
+				HWMON_T_LABEL | HWMON_T_ALARM,
+			   HWMON_T_INPUT | HWMON_T_MAX | HWMON_T_MIN |
+				HWMON_T_LABEL | HWMON_T_ALARM,
+			   HWMON_T_INPUT | HWMON_T_MAX | HWMON_T_MIN |
+				HWMON_T_LABEL | HWMON_T_ALARM,
+			   HWMON_T_INPUT | HWMON_T_MAX | HWMON_T_MIN |
  				HWMON_T_LABEL | HWMON_T_ALARM,
-			   HWMON_T_INPUT | HWMON_T_LABEL,
-			   HWMON_T_INPUT | HWMON_T_LABEL,
-			   HWMON_T_INPUT | HWMON_T_LABEL,
-			   HWMON_T_INPUT | HWMON_T_LABEL,
-			   HWMON_T_INPUT | HWMON_T_LABEL,
-			   HWMON_T_INPUT | HWMON_T_LABEL,
-			   HWMON_T_INPUT | HWMON_T_LABEL,
-			   HWMON_T_INPUT | HWMON_T_LABEL),
+			   HWMON_T_INPUT | HWMON_T_MAX | HWMON_T_MIN |
+				HWMON_T_LABEL | HWMON_T_ALARM),
  	NULL
  };
@@ -143,6 +212,7 @@ static const struct hwmon_ops nvme_hwmon_ops = {
  	.is_visible	= nvme_hwmon_is_visible,
  	.read		= nvme_hwmon_read,
  	.read_string	= nvme_hwmon_read_string,
+	.write		= nvme_hwmon_write,
  };
static const struct hwmon_chip_info nvme_hwmon_chip_info = {
diff --git a/include/linux/nvme.h b/include/linux/nvme.h
index 269b2e8..347a44f 100644
--- a/include/linux/nvme.h
+++ b/include/linux/nvme.h
@@ -803,6 +803,12 @@ struct nvme_write_zeroes_cmd {
/* Features */ +enum {
+	NVME_TEMP_THRESH_MASK		= 0xffff,
+	NVME_TEMP_THRESH_SELECT_SHIFT	= 16,
+	NVME_TEMP_THRESH_TYPE_UNDER	= 0x100000,
+};
+
  struct nvme_feat_auto_pst {
  	__le64 entries[32];
  };





[Index of Archives]     [LM Sensors]     [Linux Sound]     [ALSA Users]     [ALSA Devel]     [Linux Audio Users]     [Linux Media]     [Kernel]     [Gimp]     [Yosemite News]     [Linux Media]

  Powered by Linux