On 12/5/2023 07:55:50 -0800 Guenter Roeck <linux@xxxxxxxxxxxx> wrote: > > On 12/4/23 23:47, Cosmo Chou wrote: > > This driver implements support for temperature monitoring of Astera Labs > > PT5161L series PCIe retimer chips. > > > > This driver implementation originates from the CSDK available at > > Link: https://github.com/facebook/openbmc/tree/helium/common/recipes-lib/retimer-v2.14 > > The communication protocol utilized is based on the I2C/SMBus standard. > > > > Signed-off-by: Cosmo Chou <chou.cosmo@xxxxxxxxx> > > --- > > Documentation/hwmon/index.rst | 1 + > > Documentation/hwmon/pt516xx.rst | 48 +++ > > MAINTAINERS | 8 + > > drivers/hwmon/Kconfig | 10 + > > drivers/hwmon/Makefile | 1 + > > drivers/hwmon/pt516xx.c | 648 ++++++++++++++++++++++++++++++++ > > 6 files changed, 716 insertions(+) > > create mode 100644 Documentation/hwmon/pt516xx.rst > > create mode 100644 drivers/hwmon/pt516xx.c > > > > diff --git a/Documentation/hwmon/index.rst b/Documentation/hwmon/index.rst > > index 72f4e6065bae..2c4df18db663 100644 > > --- a/Documentation/hwmon/index.rst > > +++ b/Documentation/hwmon/index.rst > > @@ -181,6 +181,7 @@ Hardware Monitoring Kernel Drivers > > pmbus > > powerz > > powr1220 > > + pt516xx > > pxe1610 > > pwm-fan > > q54sj108a2 > > diff --git a/Documentation/hwmon/pt516xx.rst b/Documentation/hwmon/pt516xx.rst > > new file mode 100644 > > index 000000000000..945194f9a804 > > --- /dev/null > > +++ b/Documentation/hwmon/pt516xx.rst > > @@ -0,0 +1,48 @@ > > +.. SPDX-License-Identifier: GPL-2.0-or-later > > + > > +Kernel driver pt516xx > > +==================== > > + > > +Supported chips: > > + > > + * Astera Labs PT5161L > > + > > + Prefix: 'pt5161l' > > + > > This should be the driver name and be used for function/variable prefixes. > No idea where XX comes into play. The driver for sure won't support > PT516[0-9][A-Z], and should not suggest that it does. > OK, I will correct it to pt5161l. > > + Addresses: I2C 0x24 > > + > > + Datasheet: http://www.asteralabs.com/wp-content/uploads/2021/03/Astera_Labs_PT5161L_Product_Brief.pdf > > + > > +Authors: Cosmo Chou <cosmo.chou@xxxxxxxxxxxx> > > + > > +Description > > +----------- > > + > > +This driver implements support for temperature monitoring of Astera Labs > > +PT5161L series PCIe retimer chips. > > + > > +This driver implementation originates from the CSDK available at > > +https://github.com/facebook/openbmc/tree/helium/common/recipes-lib/retimer-v2.14 > > +The communication protocol utilized is based on the I2C/SMBus standard. > > + > > That strongly suggests that the code has a copyright and license associated > with it. None of it is mentioned here. > It looks like the base code (the link above) declared GPL2.0+ and Apache2.0. There is already a declaration in the pt516xx.c: (line 1) // SPDX-License-Identifier: GPL-2.0-or-later Could you give some advice about adding the copyright and license? Thanks. > > +For more detailed information and specific implementation details, it is > > +recommended to refer to the CSDK source code available at the provided GitHub > > +link. > > + > > Linux kernel drivers should be self contained. It is fine to reference documentation, > but not out-of-tree source code. > OK. I will remove this paragraph ("For more detailed information...") > > +Sysfs entries > > +---------------- > > + > > +================ ============================================== > > +temp1_input Measured temperature (in millidegrees Celsius) > > +================ ============================================== > > + > > +Debugfs entries > > +---------------- > > + > > +================ ==================================== > > +fw_ver Firmware version of the retimer > > +health Health status (8 bits) > > + [0]: Heartbeat Okay (1'b1: OK) > > + [1]: Firmware loaded Okay (1'b1: OK) > > This is not from the chip but arbitrarily assigned. It should be > reported in separate debugfs files (or all in a single file with > multiple lines and human readable information). > OK. I will separate it to "fw_load_status" and "heartbeat_status". > > + [7:2]: Reserved > > +================ ==================================== > > diff --git a/MAINTAINERS b/MAINTAINERS > > index 788be9ab5b73..492002ffd12c 100644 > > --- a/MAINTAINERS > > +++ b/MAINTAINERS > > @@ -17482,6 +17482,14 @@ F: fs/pstore/ > > F: include/linux/pstore* > > K: \b(pstore|ramoops) > > > > +PT516XX HARDWARE MONITOR DRIVER > > +M: Cosmo Chou <cosmo.chou@xxxxxxxxxxxx> > > +L: linux-hwmon@xxxxxxxxxxxxxxx > > +S: Maintained > > +F: Documentation/devicetree/bindings/hwmon/asteralabs,pt516xx.yaml > > +F: Documentation/hwmon/pt516xx.rst > > +F: drivers/hwmon/pt516xx.c > > + > > PTP HARDWARE CLOCK SUPPORT > > M: Richard Cochran <richardcochran@xxxxxxxxx> > > L: netdev@xxxxxxxxxxxxxxx > > diff --git a/drivers/hwmon/Kconfig b/drivers/hwmon/Kconfig > > index cf27523eed5a..3965bec7774a 100644 > > --- a/drivers/hwmon/Kconfig > > +++ b/drivers/hwmon/Kconfig > > @@ -1703,6 +1703,16 @@ source "drivers/hwmon/peci/Kconfig" > > > > source "drivers/hwmon/pmbus/Kconfig" > > > > +config SENSORS_PT516XX > > + tristate "Astera Labs PT516XX PCIe retimer hardware monitoring" > > + depends on I2C > > + help > > + If you say yes here you get support for temperature monitoring > > + on the Astera Labs PT516XX PCIe retimer. > > + > > + This driver can also be built as a module. If so, the module > > + will be called pt516xx. > > + > > config SENSORS_PWM_FAN > > tristate "PWM fan" > > depends on (PWM && OF) || COMPILE_TEST > > diff --git a/drivers/hwmon/Makefile b/drivers/hwmon/Makefile > > index e84bd9685b5c..1942064cd4e9 100644 > > --- a/drivers/hwmon/Makefile > > +++ b/drivers/hwmon/Makefile > > @@ -179,6 +179,7 @@ obj-$(CONFIG_SENSORS_PC87427) += pc87427.o > > obj-$(CONFIG_SENSORS_PCF8591) += pcf8591.o > > obj-$(CONFIG_SENSORS_POWERZ) += powerz.o > > obj-$(CONFIG_SENSORS_POWR1220) += powr1220.o > > +obj-$(CONFIG_SENSORS_PT516XX) += pt516xx.o > > obj-$(CONFIG_SENSORS_PWM_FAN) += pwm-fan.o > > obj-$(CONFIG_SENSORS_RASPBERRYPI_HWMON) += raspberrypi-hwmon.o > > obj-$(CONFIG_SENSORS_SBTSI) += sbtsi_temp.o > > diff --git a/drivers/hwmon/pt516xx.c b/drivers/hwmon/pt516xx.c > > new file mode 100644 > > index 000000000000..824798559fe1 > > --- /dev/null > > +++ b/drivers/hwmon/pt516xx.c > > @@ -0,0 +1,648 @@ > > +// SPDX-License-Identifier: GPL-2.0-or-later > > + > > +#include <linux/debugfs.h> > > +#include <linux/delay.h> > > +#include <linux/err.h> > > +#include <linux/i2c.h> > > +#include <linux/init.h> > > +#include <linux/hwmon.h> > > +#include <linux/module.h> > > +#include <linux/mutex.h> > > +#include <linux/of_device.h> > > + > > +/** Main-micro-assisted access command codes */ > > This is not a proper kernel documentation. Either case, use C style comments > or C++ style comments, but not both. > Got it. I will correct the style of comments. > > +// Wide register reads > > +#define ARIES_MM_RD_WIDE_REG_2B 0x1d > > +#define ARIES_MM_RD_WIDE_REG_3B 0x1e > > +#define ARIES_MM_RD_WIDE_REG_4B 0x1f > > +#define ARIES_MM_RD_WIDE_REG_5B 0x20 > > +// Wide register writes > > +#define ARIES_MM_WR_WIDE_REG_2B 0x21 > > +#define ARIES_MM_WR_WIDE_REG_3B 0x22 > > +#define ARIES_MM_WR_WIDE_REG_4B 0x23 > > +#define ARIES_MM_WR_WIDE_REG_5B 0x24 > > Please use > > #define<space>NAME<tab>value > Got it. > throughout, please. Also, at least ARIES_MM_WR_WIDE_REG_ defines are > not used. Please refrain from declaring unused defines. > Oops... I will remove the unused code. > And where does ARIES come from ? > It's the SoC name of the PCIe retimer. > > + > > +/** Temperature measurement constants */ > > +// Aries current average temp ADC code CSR > > +#define ARIES_CURRENT_AVG_TEMP_ADC_CSR 0x42c > > + > > +// Main Micro Heartbeat reg > > +#define ARIES_MM_HEARTBEAT_ADDR 0x923 > > + > > +/** Main SRAM */ > > +// AL Main SRAM DMEM offset (A0) > > +#define AL_MAIN_SRAM_DMEM_OFFSET (64 * 1024) > > +// SRAM read command > > +#define AL_TG_RD_LOC_IND_SRAM 0x16 > > + > > +/** Micros */ > > +// Offset for main micro FW info > > +#define ARIES_MAIN_MICRO_FW_INFO (96 * 1024 - 128) > > + > > +/** Path Micro Members */ > > +// FW Info (Major) offset location in struct > > +#define ARIES_MM_FW_VERSION_MAJOR 0 > > +// FW Info (Minor) offset location in struct > > +#define ARIES_MM_FW_VERSION_MINOR 1 > > +// FW Info (Build no.) offset location in struct > > +#define ARIES_MM_FW_VERSION_BUILD 2 > > + > > +/** Offsets for MM assisted accesses */ > > +// Legacy Address and Data registers (using wide registers) > > +// Reg offset to specify Address for MM assisted accesses > > +#define ARIES_MM_ASSIST_REG_ADDR_OFFSET 0xd99 > > +// Reg offset to specify Command > > +#define ARIES_MM_ASSIST_CMD_OFFSET 0xd9d > > + > > +// New Address and Data registers (not using wide registers) > > +// Reg offset to MM SPARE 0 used specify Address[7:0] > > +#define ARIES_MM_ASSIST_SPARE_0_OFFSET 0xd9f > > +// Reg offset to MM SPARE 3 used specify Data Byte 0 > > +#define ARIES_MM_ASSIST_SPARE_3_OFFSET 0xda2 > > + > > +/** Misc block offsets */ > > +// Device Load check register > > +#define ARIES_CODE_LOAD_REG 0x605 > > + > > +/** Value indicating FW was loaded properly */ > > +#define ARIES_LOAD_CODE 0xe > > + > > +/** EEPROM parameters */ > > +// Time delay between checking MM status of EEPROM write (microseconds) > > +#define ARIES_MM_STATUS_TIME 5000 > > + > > +#define ARIES_TEMP_CAL_CODE_DEFAULT 84 > > + > > +/* Struct defining FW version loaded on an Aries device */ > > +struct pt516xx_fw_ver { > > + u8 major; // FW version major release value > > + u8 minor; // FW version minor release value > > + u16 build; // FW version build release value > > +}; > > + > > +/* Each client has this additional data */ > > +struct pt516xx_data { > > + struct i2c_client *client; > > + struct dentry *debugfs; > > + struct pt516xx_fw_ver fw_ver; > > + struct mutex lock; > > + bool init_done; > > + bool pec_enable; // Enable PEC > > + bool code_load_okay; // indicate if code load reg value is expected > > + bool mm_heartbeat_okay; // indicate if Main Micro heartbeat is good > > + bool mm_wide_reg_valid; // MM assisted Wide Register access > > + u8 temp_cal_code_avg; > > +}; > > + > > +static struct dentry *pt516xx_debugfs_dir; > > + > > +/* > > + * Write multiple data bytes to Aries over I2C > > + */ > > +static int pt516xx_write_block_data(struct pt516xx_data *data, u32 address, > > + u8 len, u8 *val) > > +{ > > + struct i2c_client *client = data->client; > > + int ret; > > + u8 remain_len = len; > > + u8 xfer_len, curr_len; > > + u8 buf[16]; > > + u8 cmd = 0x0F; // [7]:pec_en, [6:5]:rsvd, [4:2]:func, [1]:start, [0]:end > > + u8 config = 0x40; // [6]:cfg_type, [4:1]:burst_len, [0]:bit16 of address > > + > > + if (data->pec_enable) > > + cmd |= 0x80; > > + > > + // If byte count is greater than 4, perform multiple iterations > > Most of those comments are pointless. This is obvious from the code. > OK. I will reduce the pointless comments. > > + while (remain_len > 0) { > > + if (remain_len > 4) { > > + curr_len = 4; > > + remain_len -= 4; > > + } else { > > + curr_len = remain_len; > > + remain_len = 0; > > + } > > + > > + buf[0] = config | (curr_len - 1) << 1 | ((address >> 16) & 0x1); > > + buf[1] = (address >> 8) & 0xff; > > + buf[2] = address & 0xff; > > + memcpy(&buf[3], val, curr_len); > > + > > + xfer_len = 3 + curr_len; > > + ret = i2c_smbus_write_block_data(client, cmd, xfer_len, buf); > > + if (ret) > > + return ret; > > + > > + val += curr_len; > > + address += curr_len; > > + } > > + > > + return 0; > > +} > > + > > +/* > > + * Read multiple data bytes from Aries over I2C > > + */ > > +static int pt516xx_read_block_data(struct pt516xx_data *data, u32 address, > > + u8 len, u8 *val) > > +{ > > + struct i2c_client *client = data->client; > > + int ret, tries; > > + u8 remain_len = len; > > + u8 curr_len; > > + u8 wbuf[16], rbuf[24]; > > + u8 cmd = 0x08; // [7]:pec_en, [6:5]:rsvd, [4:2]:func, [1]:start, [0]:end > > + u8 config = 0x00; // [6]:cfg_type, [4:1]:burst_len, [0]:bit16 of address > > + > > + if (data->pec_enable) > > + cmd |= 0x80; > > + > > + // If byte count is greater than 16, perform multiple iterations > > + while (remain_len > 0) { > > + if (remain_len > 16) { > > + curr_len = 16; > > + remain_len -= 16; > > + } else { > > + curr_len = remain_len; > > + remain_len = 0; > > + } > > + > > + wbuf[0] = config | (curr_len - 1) << 1 | > > + ((address >> 16) & 0x1); > > + wbuf[1] = (address >> 8) & 0xff; > > + wbuf[2] = address & 0xff; > > + > > + // Perform read operation > > + for (tries = 0; tries < 3; tries++) { > > + ret = i2c_smbus_write_block_data(client, (cmd | 0x2), 3, > > + wbuf); > > + if (ret) > > + return ret; > > + > > + ret = i2c_smbus_read_block_data(client, (cmd | 0x1), > > + rbuf); > > + if (ret == curr_len) > > + break; > > + } > > + if (tries >= 3) > > + return -ENODATA; > > + > > + memcpy(val, rbuf, curr_len); > > + val += curr_len; > > + address += curr_len; > > + } > > + > > + return 0; > > +} > > + > > +static int pt516xx_read_wide_reg(struct pt516xx_data *data, u32 address, > > + u8 width, u8 *val) > > +{ > > + int ret, tries; > > + u8 buf[8]; > > + u8 status; > > + > > + if (data->mm_wide_reg_valid) { > > + // Write address (3 bytes) > > + buf[0] = address & 0xff; > > + buf[1] = (address >> 8) & 0xff; > > + buf[2] = (address >> 16) & 0x1; > > + ret = pt516xx_write_block_data( > > + data, ARIES_MM_ASSIST_SPARE_0_OFFSET, 3, buf); > > + if (ret) > > + return ret; > > + > > + // Set command based on width > > + switch (width) { > > + case 2: > > + buf[0] = ARIES_MM_RD_WIDE_REG_2B; > > + break; > > + case 3: > > + buf[0] = ARIES_MM_RD_WIDE_REG_3B; > > + break; > > + case 4: > > + buf[0] = ARIES_MM_RD_WIDE_REG_4B; > > + break; > > + case 5: > > + buf[0] = ARIES_MM_RD_WIDE_REG_5B; > > + break; > > + default: > > + return -EINVAL; > > + } > > + ret = pt516xx_write_block_data(data, ARIES_MM_ASSIST_CMD_OFFSET, > > + 1, buf); > > + if (ret) > > + return ret; > > + > > + // Check access status > > + status = 0xff; > > + for (tries = 0; tries < 100; tries++) { > > + ret = pt516xx_read_block_data( > > + data, ARIES_MM_ASSIST_CMD_OFFSET, 1, &status); > > + if (ret) > > + return ret; > > + > > + if (status == 0) > > + break; > > + > > + usleep_range(ARIES_MM_STATUS_TIME, > > + ARIES_MM_STATUS_TIME + 1000); > > + } > > + if (status != 0) > > + return -ETIMEDOUT; > > + > > + // Read N bytes of data based on width > > + ret = pt516xx_read_block_data( > > + data, ARIES_MM_ASSIST_SPARE_3_OFFSET, width, val); > > + if (ret) > > + return ret; > > + } else { > > + return pt516xx_read_block_data(data, address, width, val); > > + } > > + > > + return 0; > > +} > > + > > +/* > > + * Read multiple (up to eight) data bytes from micro SRAM over I2C > > + */ > > +static int > > +pt516xx_read_block_data_main_micro_indirect(struct pt516xx_data *data, > > + u32 address, u8 len, u8 *val) > > +{ > > + int ret, tries; > > + u8 buf[8]; > > + u8 i, status; > > + u32 uind_offs = ARIES_MM_ASSIST_REG_ADDR_OFFSET; > > + u32 eeprom_base, eeprom_addr; > > + > > + // No multi-byte indirect support here. Hence read a byte at a time > > + eeprom_base = address - AL_MAIN_SRAM_DMEM_OFFSET; > > + for (i = 0; i < len; i++) { > > + // Write eeprom addr > > + eeprom_addr = eeprom_base + i; > > + buf[0] = eeprom_addr & 0xff; > > + buf[1] = (eeprom_addr >> 8) & 0xff; > > + buf[2] = (eeprom_addr >> 16) & 0xff; > > + ret = pt516xx_write_block_data(data, uind_offs, 3, buf); > > + if (ret) > > + return ret; > > + > > + // Write eeprom cmd > > + buf[0] = AL_TG_RD_LOC_IND_SRAM; > > + ret = pt516xx_write_block_data(data, uind_offs + 4, 1, buf); > > + if (ret) > > + return ret; > > + > > + // Test successful access > > + status = 0xff; > > + for (tries = 0; tries < 255; tries++) { > > + ret = pt516xx_read_block_data(data, uind_offs + 4, 1, > > + &status); > > + if (ret) > > + return ret; > > + > > + if (status == 0) > > + break; > > + } > > + if (status != 0) > > + return -ETIMEDOUT; > > + > > + ret = pt516xx_read_block_data(data, uind_offs + 3, 1, buf); > > + if (ret) > > + return ret; > > + > > + val[i] = buf[0]; > > + } > > + > > + return 0; > > +} > > + > > +/* > > + * Check the status of firmware > > + */ > > +static int pt516xx_fwsts_check(struct pt516xx_data *data) > > +{ > > + int ret, tries; > > + u8 buf[8]; > > + u8 heartbeat, major = 0, minor = 0; > > + u16 build = 0; > > + bool hb_changed = false; > > + > > + // Read Code Load reg > > + ret = pt516xx_read_block_data(data, ARIES_CODE_LOAD_REG, 1, buf); > > + if (ret) > > + return ret; > > + > > + if (buf[0] < ARIES_LOAD_CODE) { > > + dev_warn( > > + &data->client->dev, > > + "Code Load reg unexpected. Not all modules are loaded %x\n", > > + buf[0]); > > + data->code_load_okay = false; > > + } else { > > + data->code_load_okay = true; > > + } > > + > > + // Check Main Micro heartbeat > > + // If heartbeat value does not change for 100 tries, no MM heartbeat > > + // Else heartbeat present even if one value changes > > + ret = pt516xx_read_block_data(data, ARIES_MM_HEARTBEAT_ADDR, 1, buf); > > + if (ret) > > + return ret; > > + > > + heartbeat = buf[0]; > > + for (tries = 0; tries < 100; tries++) { > > + ret = pt516xx_read_block_data(data, ARIES_MM_HEARTBEAT_ADDR, 1, > > + buf); > > + if (ret) > > + return ret; > > + > > + if (buf[0] != heartbeat) { > > + hb_changed = true; > > + break; > > + } > > + } > > + data->mm_heartbeat_okay = hb_changed; > > + > > + // Read FW version > > + // If heartbeat not there, set default FW values to 0.0.0 > > + // and return success > > + if (data->mm_heartbeat_okay) { > > + // Get FW version (major) > > + ret = pt516xx_read_block_data_main_micro_indirect( > > + data, > > + ARIES_MAIN_MICRO_FW_INFO + ARIES_MM_FW_VERSION_MAJOR, 1, > > + &major); > > + if (ret) > > + return ret; > > + > > + // Get FW version (minor) > > + ret = pt516xx_read_block_data_main_micro_indirect( > > + data, > > + ARIES_MAIN_MICRO_FW_INFO + ARIES_MM_FW_VERSION_MINOR, 1, > > + &minor); > > + if (ret) > > + return ret; > > + > > + // Get FW version (build) > > + ret = pt516xx_read_block_data_main_micro_indirect( > > + data, > > + ARIES_MAIN_MICRO_FW_INFO + ARIES_MM_FW_VERSION_BUILD, 2, > > + (u8 *)&build); > > + if (ret) > > + return ret; > > + } else { > > + dev_warn(&data->client->dev, "No Main Micro Heartbeat\n"); > > This and other messages above would create a lot of noise if persistent > since the function is called repeatedly. > OK. I will reduce the debug messages or change to use dev_dbg(). > > + } > > + data->fw_ver.major = major; > > + data->fw_ver.minor = minor; > > + data->fw_ver.build = build; > > + > > + return 0; > > +} > > + > > +static int pt516xx_fw_is_at_least(struct pt516xx_data *data, u8 major, u8 minor, > > + u16 build) > > +{ > > + u32 ver = major << 24 | minor << 16 | build; > > + u32 curr_ver = data->fw_ver.major << 24 | data->fw_ver.minor << 16 | > > + data->fw_ver.build; > > + > > + if (curr_ver >= ver) > > + return true; > > + > > + return false; > > +} > > + > > +static int pt516xx_init_dev(struct pt516xx_data *data) > > +{ > > + int ret; > > + > > + mutex_lock(&data->lock); > > + ret = pt516xx_fwsts_check(data); > > + mutex_unlock(&data->lock); > > + if (ret) > > + return ret; > > + > > + dev_info(&data->client->dev, "fw_ver: %u.%u.%u\n", data->fw_ver.major, > > + data->fw_ver.minor, data->fw_ver.build); > > + > > Please no such noise > OK. > > + if (pt516xx_fw_is_at_least(data, 2, 2, 0)) > > + data->mm_wide_reg_valid = true; > > + > > + data->temp_cal_code_avg = ARIES_TEMP_CAL_CODE_DEFAULT; > > What is the point of this ? temp_cal_code_avg will never be anything else. > I will change to directly use ARIES_TEMP_CAL_CODE_DEFAULT when calculating. > > + data->init_done = true; > > + > > + return 0; > > +} > > + > > +static int pt516xx_read(struct device *dev, enum hwmon_sensor_types type, > > + u32 attr, int channel, long *val) > > +{ > > + struct pt516xx_data *data = dev_get_drvdata(dev); > > + int ret = 0; > > + long adc_code = 0; > > + > > This assumes a specific endianness which isn't given. I am quite sure that the > code as written won't work in a big endian system. Also, "long" may be 4 or 8 byte > depending on the CPU architecture. > Oops, I will revise it to be compatible with the big endian system. Using long is just to match the prototype of read(): "long *val". > > + if (!data->init_done) { > > + ret = pt516xx_init_dev(data); > > + if (ret) { > > + dev_err(dev, "pt516xx_init_dev failed %d\n", ret); > > + return ret; > > + } > > + } > > This is really the wrong place for this code. It should be checked and initialized > in the probe function, and probe should bail out if it fails. > > Also, the only reason to call pt516xx_init_dev() seems to be to determine > if "wide reegister access" is supported. If it isn't, pt516xx_read_block_data() > is used. Why not call pt516xx_read_block_data() below directly and not bother > with wide register access ? > Previously I followed CSDK using wide_reg to access new FW. Also tried old FW, but old FW only support read_block_data. After further testing with the new FW, it also does support read_block_data. So I will revise it to use read_block_data directly for temperature. Thanks. > > + > > + switch (attr) { > > + case hwmon_temp_input: > > + mutex_lock(&data->lock); > > + ret = pt516xx_read_wide_reg(data, > > + ARIES_CURRENT_AVG_TEMP_ADC_CSR, 4, > > + (u8 *)&adc_code); > > + mutex_unlock(&data->lock); > > + break; > > + default: > > + return -EOPNOTSUPP; > > + } > > + if (ret) { > > + dev_err(dev, "Read adc_code failed %d\n", ret); > > + return ret; > > + } The error return has been checked here. > > + if (adc_code == 0 || adc_code >= 0x3ff) { > > + dev_err(dev, "Invalid adc_code %lx\n", adc_code); > > + return -EINVAL; > OK. I will change to use -ENODATA here. > I guess this is supposed to cover an error return from pt516xx_read_wide_reg() > returned an error, assuming that adc_code was not overwritten in that case. > This is inappropriate. Error returns from pt516xx_read_wide_reg() should be > handled explicitly (and don't indicate "Invalid argument"). > > > + } > > + > > The above will create logging noise if the chip has a problem. Please make it dev_dbg. > OK. > > + *val = 110000 + ((adc_code - (data->temp_cal_code_avg + 250)) * -320); > > + > > + return 0; > > +} > > + > > +static umode_t pt516xx_is_visible(const void *data, > > + enum hwmon_sensor_types type, u32 attr, > > + int channel) > > +{ > > + switch (attr) { > > + case hwmon_temp_input: > > + return 0444; > > + default: > > + break; > > + } > > + > > + return 0; > > +} > > + > > +static const struct hwmon_channel_info *pt516xx_info[] = { > > + HWMON_CHANNEL_INFO(temp, HWMON_T_INPUT), > > + NULL > > +}; > > + > > +static const struct hwmon_ops pt516xx_hwmon_ops = { > > + .is_visible = pt516xx_is_visible, > > + .read = pt516xx_read, > > +}; > > + > > +static const struct hwmon_chip_info pt516xx_chip_info = { > > + .ops = &pt516xx_hwmon_ops, > > + .info = pt516xx_info, > > +}; > > + > > +static ssize_t pt516xx_debugfs_read_fw_ver(struct file *file, char __user *buf, > > + size_t count, loff_t *ppos) > > +{ > > + struct pt516xx_data *data = file->private_data; > > + int ret; > > + char ver[32]; > > + > > + mutex_lock(&data->lock); > > + ret = pt516xx_fwsts_check(data); > > + mutex_unlock(&data->lock); > > What is the point of (re-)reading the firmware version repeatedly ? > The PCIe retimer uses normal power rail, that is, it's turned off when the system is powered-off. Consider that the PCIe retimer is used on a server system, and monitored by a BMC. (an independent management controller, with embedded Linux) The BMC is still working when the system is powered-off but remains standby power rail. When the retimer FW is updated and reload, it can read the correct FW version. > > + if (ret) > > + return ret; > > + > > + ret = snprintf(ver, sizeof(ver), "%u.%u.%u\n", data->fw_ver.major, > > + data->fw_ver.minor, data->fw_ver.build); > > + if (ret < 0) > > + return ret; > > + > > + return simple_read_from_buffer(buf, count, ppos, ver, ret + 1); > > +} > > + > > +static const struct file_operations pt516xx_debugfs_ops_fw_ver = { > > + .read = pt516xx_debugfs_read_fw_ver, > > + .open = simple_open, > > +}; > > + > > +static ssize_t pt516xx_debugfs_read_health(struct file *file, char __user *buf, > > + size_t count, loff_t *ppos) > > +{ > > + struct pt516xx_data *data = file->private_data; > > + int ret; > > + u8 status = 0; > > + char health[8]; > > + > > + mutex_lock(&data->lock); > > + ret = pt516xx_fwsts_check(data); > > + mutex_unlock(&data->lock); > > + if (ret == 0) { > > + status |= data->mm_heartbeat_okay ? 1 : 0; // bit0 > > + status |= data->code_load_okay ? 2 : 0; // bit1 > > Those should really be separate debugfs files. > OK. > > + } > > + > > + ret = snprintf(health, sizeof(health), "0x%02x\n", status); > > + if (ret < 0) > > + return ret; > > + > > + return simple_read_from_buffer(buf, count, ppos, health, ret + 1); > > +} > > + > > +static const struct file_operations pt516xx_debugfs_ops_health = { > > + .read = pt516xx_debugfs_read_health, > > + .open = simple_open, > > +}; > > + > > +static int pt516xx_init_debugfs(struct pt516xx_data *data) > > +{ > > + if (!pt516xx_debugfs_dir) > > + return -ENOENT; > > + > > debugfs functions handle this situation, and return values therefore do not > need to be checked. > Got it. I will remove it. > > + data->debugfs = debugfs_create_dir(dev_name(&data->client->dev), > > + pt516xx_debugfs_dir); > > + if (IS_ERR_OR_NULL(data->debugfs)) > > + return -ENOENT; > > + > Again, unnecessary. > Got it. > > + debugfs_create_file("fw_ver", 0444, data->debugfs, data, > > + &pt516xx_debugfs_ops_fw_ver); > > + > > + debugfs_create_file("health", 0444, data->debugfs, data, > > + &pt516xx_debugfs_ops_health); > > + > > + return 0; > > +} > > + > > +static int pt516xx_probe(struct i2c_client *client) > > +{ > > + struct device *dev = &client->dev; > > + struct device *hwmon_dev; > > + struct pt516xx_data *data; > > + > > + data = devm_kzalloc(dev, sizeof(struct pt516xx_data), GFP_KERNEL); > > + if (!data) > > + return -ENOMEM; > > + > > + data->client = client; > > + mutex_init(&data->lock); > > + pt516xx_init_dev(data); > > + > > + hwmon_dev = devm_hwmon_device_register_with_info( > > + dev, client->name, data, &pt516xx_chip_info, NULL); > > + > > + if (pt516xx_init_debugfs(data)) > > + dev_warn(dev, "Failed to register debugfs\n"); > > debugfs should fail silently. > > debugfs files are not removed if a single device is unloaded. Did you check > what happens in this case ? > Oops, I will add the remove callback function to remove the debugfs files. > > + > > + return PTR_ERR_OR_ZERO(hwmon_dev); > > +} > > + > > +static const struct i2c_device_id pt516xx_id[] = { > > + { "pt5161l", 0 }, > > + {} > > +}; > > +MODULE_DEVICE_TABLE(i2c, pt516xx_id); > > + > > +static const struct of_device_id __maybe_unused pt516xx_of_match[] = { > > + { .compatible = "asteralabs,pt5161l" }, > > + {}, > > +}; > > +MODULE_DEVICE_TABLE(of, pt516xx_of_match); > > + > > +static struct i2c_driver pt516xx_driver = { > > + .class = I2C_CLASS_HWMON, > > + .driver = { > > + .name = "pt516xx", > > + .of_match_table = of_match_ptr(pt516xx_of_match), > > This means the driver won't instantiate on an ACPI system if devicetree > support is disabled. Is this intentional ? > OK. I will add the acpi_match_table. > > + }, > > + .probe = pt516xx_probe, > > + .id_table = pt516xx_id, > > +}; > > + > > +module_i2c_driver(pt516xx_driver); > > + > > +static int __init pt516xx_core_init(void) > > +{ > > + pt516xx_debugfs_dir = debugfs_create_dir("pt516xx", NULL); > > + if (IS_ERR(pt516xx_debugfs_dir)) > > + pt516xx_debugfs_dir = NULL; > > + > > + return 0; > > +} > > + > > +static void __exit pt516xx_core_exit(void) > > +{ > > + debugfs_remove_recursive(pt516xx_debugfs_dir); > > +} > > + > > +module_init(pt516xx_core_init); > > +module_exit(pt516xx_core_exit); > > + > > +MODULE_AUTHOR("Cosmo Chou <cosmo.chou@xxxxxxxxxxxx>"); > > +MODULE_DESCRIPTION("Hwmon driver for Astera Labs Aries PCIe retimer"); > > +MODULE_LICENSE("GPL"); >