On 9/3/24 23:51, Patryk wrote:
Hi,

I'm trying to bring up LTC2977 and LTC2974 devices (I used the DC1962CF demonstration system, which hosts both devices) on our board (NXP Layerscape based) using an existing driver, namely LTC2978, and I faced some minor problems which I would like to clarify.

The driver probed successfully for both devices, and various sysfs attributes have been created under */hwmonX/. However, I would like to focus on only a few of them:

- temp1_input: current temperature
- temp1_max: maximum allowed temperature; any temperature above this setting triggers a warning
- temp1_max_alarm: boolean value indicating whether or not an alarm condition has occurred

I wanted to check that everything works, so I ran the following test, given that temp1_input in my testing environment usually shows a value around 38000:

- I read temp1_max_alarm using: cat temp1_max_alarm -> it showed "0"
- I set temp1_max to 20000
- I read temp1_max_alarm using: cat temp1_max_alarm -> it returned "cat: read error: No such device or address"

This occurred only on the LTC2977; it never happened on the LTC2974. I traced what exactly happens when I issue this command, and it seems that the target device, the LTC2977, responds with NACK to one of the issued commands. But what is this command exactly?

When one reads temp1_max_alarm, the driver (pmbus_core in this case) does the following:

- the driver reads STATUS_TEMPERATURE and, if bit 6 of this register (0x40, Status_temperature_ot_warn) is set, it continues with further commands
- the driver reads READ_TEMPERATURE_1
- the driver reads OT_WARN_LIMIT
- the driver writes the status register (STATUS_TEMPERATURE) with the same value that it previously read (*)
- the driver compares OT_WARN_LIMIT and READ_TEMPERATURE_1 and then returns the appropriate value (0 or 1) to userspace

(*) This was added in commit 35f165f08950a876f1b95a61d79c93678fba2fd6, and it seems to be compliant with the PMBus specification, which says (PMBus Specification rev. 1.3.1, Part II, chapter 10.2.4):

"Any or all of the bits in any status register except STATUS_BYTE and STATUS_WORD can be directly cleared by issuing the status command with one data byte that is written. The data byte is a binary value. A 1 in any bit position indicates that bit is to be cleared, if set, and unchanged if not set"

Below is the simplified sequence of operations performed while reading temp1_max_alarm:

- smbus_read: i2c-7 a=033 f=0004 c=7d BYTE_DATA /* read STATUS_TEMPERATURE, returns 0x40 */
- smbus_read: i2c-7 a=033 f=0004 c=8d WORD_DATA /* read READ_TEMPERATURE_1 */
- smbus_read: i2c-7 a=033 f=0004 c=51 WORD_DATA /* OT_WARN_LIMIT */
- smbus_write: i2c-7 a=033 f=0004 c=7d BYTE_DATA l=1 [40] /* write back status register to clear warn bit */

The last operation fails because a NACK is received.

So I'm wondering: given that this "write back" operation takes place in the correct place and in the correct order according to the PMBus specification, could it be that the device itself does not implement this correctly and simply responds with NACK to the write-back to the status register? On the other hand, why does it work correctly on the LTC2974 but not on the LTC2977?

I would be grateful for any insights or guidance on resolving this issue.

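The write-1-to-clear rule quoted from the specification, and the alarm decision described in the steps above, can be sketched in plain C. This is only an illustration, not the pmbus_core code: the names STATUS_TEMP_OT_WARNING, status_after_clear(), ot_warn_alarm() and w1c_example() are hypothetical, and the exact comparison the driver uses may differ from the strict "greater than" assumed here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit 6 of STATUS_TEMPERATURE; matches the 0x40 value seen in the trace. */
#define STATUS_TEMP_OT_WARNING 0x40u

/*
 * Write-1-to-clear semantics: every bit set in `written` is cleared in
 * the status register, all other bits are left unchanged.
 */
static uint8_t status_after_clear(uint8_t status, uint8_t written)
{
    return (uint8_t)(status & (uint8_t)~written);
}

/*
 * Alarm decision as described in the message (assumption: strict '>').
 * The status bit gates the result; values are in millidegrees Celsius.
 */
static bool ot_warn_alarm(uint8_t status, long temp_mdeg, long limit_mdeg)
{
    return (status & STATUS_TEMP_OT_WARNING) && temp_mdeg > limit_mdeg;
}

/* Hypothetical walk-through using the numbers from the test above. */
void w1c_example(void)
{
    uint8_t status = 0x40;            /* value returned by the first read */

    assert(ot_warn_alarm(status, 38000, 20000));

    /* Writing 0x40 back would clear only the warning bit. */
    assert(status_after_clear(status, 0x40) == 0x00);
}
```

On a chip that accepts the write, the status byte would become 0x00 after the last step; on the LTC2977 that write is the transfer that gets the NACK.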
The datasheets will tell you: the status registers are supposed to be read-only on those chips. We'll need to add some code to detect that condition and refrain from clearing the status register if the chip doesn't support writes (or maybe ignore errors from the clear operation). Ignoring the error might be the easiest fix.

Note that the 2974 datasheet also states that the temperature register is read-only, so the datasheet does not seem to match the chip's actual behavior.

Guenter