On 12/12/23 07:30, Nuno Sá wrote:
On Tue, 2023-12-12 at 07:14 -0800, Guenter Roeck wrote:
On 12/12/23 06:28, Nuno Sá wrote:
On Mon, 2023-12-11 at 07:36 -0800, Guenter Roeck wrote:
On Tue, Dec 05, 2023 at 04:22:56PM +0100, Nuno Sa via B4 Relay wrote:
From: Nuno Sa <nuno.sa@xxxxxxxxxx>
The LTC4282 hot swap controller allows a board to be safely inserted and
removed from a live backplane. Using one or more external N-channel pass
transistors, board supply voltage and inrush current are ramped up at an
adjustable rate. An I2C interface and onboard ADC allows for monitoring
of board current, voltage, power, energy and fault status.
Signed-off-by: Nuno Sa <nuno.sa@xxxxxxxxxx>
---
+
+power1_good Power considered good
I really don't like this attribute. Like the ones below it is non-standard
and invisible for standard applications. On top of that, I think it isn't
really related to "power" but to the output voltage. What does it actually
report that isn't included in the FET faults ?
This is detected with the FB pin and a voltage divider (from the output
voltage). Basically depending on the level of that pin, the chip indicate
power
good or power bad. I was also very reluctant with this attribute (I mention
it
in the v1 cover). This might not even indicate any misbehave. We also
support
reporting this using the gpio1 pin (if we set it that way). So, I guess I
can
just drop this one and add support for it if we ever have a real usecase
where I
can actually justify having it :).
We already have the power_bad fault log in debugfs so I'm not sure if adding
this one there adds much value.
+fet_short_fault FET short alarm
+fet_bad_fault FET bad alarm
Those attributes have little value since they are not standard attributes
and won't be seen by standard applications. On top of that, it is not
clear
(not documented) what the attribute actually reports. I assume it is
associated with the output voltage, i.e., in0, but that is just an
assumption.
fet_short - This is one is detected if the ADC measures a current sense
voltage
0.25mv while the fet gate is off.
fet_bad - Is set by monitoring the voltage at the gate and the drain to
source
voltage.
These ones might indicate real issues with the HW so I thought they could be
important...
What do you think about introducing a standard inX_fault attribute ?
It would not be as specific as short/bad, but I think it would be more
useful and we could add it to the ABI.
It would be better than nothing. And we do have fault logs for both these
failures so userspace could also use that to know exactly what was the
issue. If
that's ok with you, I would then report this in inX_fault? Did you had in
mind
putting this in in0 (vsource) or adding a new channel?
In my first draft I had another voltage channel (label: VFET) to report the
fet_bad condition. I was using the inX_crit or inX_lcrit but it felt bad so
I
removed it...
This isn't an extra voltage channel. It should be associated with the output
voltage because that is what is affected, and that would be in0.
Noted... will go with that and add an introductory patch for having inX_fault in
hwmon.
+fault_logs_reset Clears all the Logged Faults
What exactly does that do that is user visible ?
Well, this one is because in some configurations the chip won't enable the
output load until you reset/clear the fault log keeping it from enabling the
output. This is the comment I have in the code:
"Fault log failures. These faults might be important in systems where auto-
retry
is not enabled since they will cause the part to latch off until they are
cleared. Typically that happens when the system admin is close enough so he
can
check what happened and manually clear the faults. Moreover, manually
clearing
the faults might only matter when ON_FAULT_MASK in the CONTROL register is
set
(which is the default) as in that case, a turn off signal from the ON pin
won't
clear them."
In v1 I was allowing to clear fauls log individually and you recommended to
have
an attribute to clear them all at once as that would simplify things.
I just kept it in here because this might be important for the chip to work
as
expected again so having it in debugfs might be weird.
How about using a write to in0_fault to clear the log ?
But that would be just related with the output voltage. You can also have
failures on VDD (over and undervolatge fault - I'm using in1_crit and in1_lcrit)
and if you have the auto retry bit disabled, then clearing fault logs might be
important. This attribute is nice because makes it clear what will be done.
OTOH, I do understand your worries with non standard ABI...
Depends on the meaning of "fault". PMBus, for example, uses the "fault" terminology
to indicate that critical limits have been exceeded. That isn't a "fault" in the
terminology used by the hwmon subsystem. Violating a voltage/current/power range
isn't a fault. A shorted FET or temperature sensor, however, is a fault.
In that sense, if the output is turned off because a voltage / current / power
limit was violated, a write into the respective alarm register would be appropriate
to reset it if the condition doesn't auto-reset. It really all depends on the context.
I'm not sure how usual is for fault logs to be around that would justify for a
global standard attribute.
We could also use in[01]_reset_history. While that is originally intended to reset
min/max voltage history, I think it makes sense to extend the meaning to include
fault history (even more so if that history includes over/undervoltage events).
Plus, there are other _reset_history attributes which could be used to reset
power/current/temperature history separately if that is supported by the chip.
Thanks,
Guenter