On Wed, 4 Sep 2024 at 16:46, Guenter Roeck <linux@xxxxxxxxxxxx> wrote:
>
> On 9/3/24 23:51, Patryk wrote:
> > Hi
> > I'm trying to bring up LTC2977 and LTC2974 devices (I used the DC1962CF
> > demonstration system, which hosts both devices) on our board (NXP
> > Layerscape based) using an existing driver, namely LTC2978 ( and I
> > faced some minor problems which I would like to clarify.
> > The driver probed successfully for both devices, and various sysfs
> > attributes have been created under */hwmonX/, however I would like to
> > focus only on a few of them, namely:
> > - temp1_input: current temperature
> > - temp1_max: max allowed temperature, any temp value above this
> >   setting will trigger a warning
> > - temp1_max_alarm: boolean value indicating whether or not alarm
> >   conditions have occurred
> >
> > I wanted to test if everything works fine so I conducted the following
> > test, assuming that temp1_input in my testing environment usually
> > shows a value around 38000:
> > - I read temp1_max_alarm using: cat temp1_max_alarm -> it showed "0"
> > - I set temp1_max to 20000
> > - I read temp1_max_alarm using: cat temp1_max_alarm -> it returned
> >   "cat: read error: No such device or address"
> > It occurred only on LTC2977, never happened on LTC2974.
> > I traced down what exactly happens when I issue this command and it
> > seems that the target device, LTC2977, responds with NACK to one of the
> > issued commands. But what is this command exactly?
> > When one reads temp1_max_alarm the driver (pmbus_core in this case)
> > does the following:
> > - the driver reads STATUS_TEMPERATURE and if 6th bit in this register
> >   (Status_temperature_ot_warn) is set it continues with further commands
> > - the driver reads READ_TEMPERATURE_1
> > - the driver reads OT_WARN_LIMIT
> > - the driver updates the status register (STATUS_TEMPERATURE) with the
> >   same value that it previously read *(see_below)
> > - the driver compares OT_WARN_LIMIT and READ_TEMPERATURE_1 and then it
> >   returns appropriate value (0 or 1) to userspace
> >
> > * this was added in 35f165f08950a876f1b95a61d79c93678fba2fd6 commit,
> > and it seems to be compliant with PMBUS specification that says (PMBus
> > Specification rev.1.3.1 part II, chapter 10.2.4):
> > "Any or all of the bits in any status register except STATUS_BYTE and
> > STATUS_WORD can be directly cleared by issuing the status command with
> > one data byte that is written. The data byte is a binary value. A 1 in
> > any bit position indicates that bit is to be cleared, if set, and
> > unchanged if not set"
> > Below is the simplified sequence of operations that are performed
> > while reading temp1_max_alarm:
> > - smbus_read: i2c-7 a=033 f=0004 c=7d BYTE_DATA /* read
> >   STATUS_TEMPERATURE, returns 0x40 */
> > - smbus_read: i2c-7 a=033 f=0004 c=8d WORD_DATA /* read READ_TEMPERATURE_1 */
> > - smbus_read: i2c-7 a=033 f=0004 c=51 WORD_DATA /* OT_WARN_LIMIT */
> > - smbus_write: i2c-7 a=033 f=0004 c=7d BYTE_DATA l=1 [40] /* write
> >   back status register to clear warn bit */
> >
> > The last operation fails due to NACK received.
> > So I'm wondering - considering that this "write back" operation takes
> > place in the correct place, in the correct order and so on according
> > to PMBUS specification, could it be that the device itself does not
> > implement this correctly and simply responds with NACK to the write
> > back operation to status register?
> > On the other hand - why does it work correctly on LTC2974 but would
> > not work on LTC2977?
> >
> > I would be grateful for any insights or guidance on resolving this issue.
> >

Hi, thanks for the response

> Datasheets will tell you: The status registers are supposed to be read-only
> on those chips.

I'm aware of this. Actually that's why I asked: I was a bit confused when I
noticed that the driver tries to write to a read-only register, which
didn't work on LTC2977 but, on the other hand, worked correctly with LTC2974.

> We'll need to add some code to detect that condition and
> refrain from clearing the status register if the chip doesn't support
> writes (or maybe ignore errors from the clear operation). Ignoring the
> error might be the easiest fix.

I will apply this fix to our codebase then, unless I come up with a better idea.
Thanks for the clarification.

BR
Patryk
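
For illustration only, here is a minimal user-space sketch of the "ignore errors from the clear operation" idea discussed above. It is not the pmbus_core code and has not been run against real hardware. The command codes (0x7d, 0x8d, 0x51) and the 0x40 warning bit come from the trace earlier in this thread; everything else (struct mock_chip, mock_read_byte, mock_write_byte, read_alarm, and the ignore_clear_error flag) is a hypothetical name made up for this example. The comparison of READ_TEMPERATURE_1 against OT_WARN_LIMIT is left out to keep the sketch short.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* PMBus command codes as seen in the trace in this thread. */
#define PMBUS_STATUS_TEMPERATURE 0x7d
#define PMBUS_READ_TEMPERATURE_1 0x8d /* not modelled in this sketch */
#define PMBUS_OT_WARN_LIMIT      0x51 /* not modelled in this sketch */
#define STATUS_TEMP_OT_WARNING   0x40 /* value returned in the trace */

/* Hypothetical stand-in for a chip; this is not a kernel structure. */
struct mock_chip {
    uint8_t status_temperature;
    bool status_writable; /* false models a chip that NACKs status writes */
    int write_attempts;
};

/* Mock byte read: returns the register value, or -ENXIO for unknown commands. */
int mock_read_byte(struct mock_chip *chip, uint8_t cmd)
{
    assert(chip);
    if (cmd == PMBUS_STATUS_TEMPERATURE)
        return chip->status_temperature;
    return -ENXIO;
}

/*
 * Mock byte write with write-1-to-clear semantics: every 1 bit in val
 * clears the matching status bit. A read-only chip answers with -ENXIO,
 * which is how the NACK surfaced to userspace in this thread.
 */
int mock_write_byte(struct mock_chip *chip, uint8_t cmd, uint8_t val)
{
    assert(chip);
    chip->write_attempts++;
    if (cmd != PMBUS_STATUS_TEMPERATURE || !chip->status_writable)
        return -ENXIO;
    chip->status_temperature &= (uint8_t)~val;
    return 0;
}

/*
 * Returns 1 if the alarm bit is set, 0 if it is clear, or a negative
 * errno. With ignore_clear_error set, a failed write-back no longer
 * hides the alarm state that was already read successfully.
 */
int read_alarm(struct mock_chip *chip, uint8_t mask, bool ignore_clear_error)
{
    int status = mock_read_byte(chip, PMBUS_STATUS_TEMPERATURE);

    if (status < 0)
        return status;

    if (status & mask) {
        /* Write back the value just read to clear the latched bits. */
        int ret = mock_write_byte(chip, PMBUS_STATUS_TEMPERATURE,
                                  (uint8_t)status);
        if (ret < 0 && !ignore_clear_error)
            return ret;
    }
    return !!(status & mask);
}
```

With status_writable set to false and ignore_clear_error set to false, read_alarm() returns -ENXIO, which matches the "No such device or address" error above. With ignore_clear_error set to true it returns 1 and the status bit simply stays latched. Detecting read-only status registers up front, the other option mentioned above, would avoid the pointless write altogether, but it needs per-chip knowledge that this sketch does not try to model.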