Re: [PATCH 2/3] hwmon: Add support for ltc2947

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, 2019-10-07 at 05:44 -0700, Guenter Roeck wrote:
> 
> On 10/7/19 5:25 AM, Sa, Nuno wrote:
> > On Fri, 2019-10-04 at 08:06 -0700, Guenter Roeck wrote:
> > > On Fri, Oct 04, 2019 at 07:45:07AM +0000, Sa, Nuno wrote:
> > > [ ... ]
> > > > > > +static int ltc2947_val_read(struct ltc2947_data *st, const
> > > > > > u8
> > > > > > reg,
> > > > > > +			    const u8 page, const size_t size,
> > > > > > s64 *val)
> > > > > > +{
> > > > > > +	int ret;
> > > > > > +	u64 __val = 0;
> > > > > > +
> > > > > > +	mutex_lock(&st->lock);
> > > > > > +
> > > 
> > > On a side note, suspend code is supposed to be atomic,
> > > If this lock is held, the process or thread holding it
> > > will likely stall suspend since it won't run.
> > > At the very least, this would have to be a trylock,
> > > with suspend failing if the lock can not be acquired.
> > 
> > Got it. Even more, Now I don't see the point of having the mutex in
> > the
> > PM callbacks (though I saw other drivers doing this). As you said,
> > at
> > the very least the trylock should be used since a frozen task might
> > have the mutex locked. Either way, I will drop both the flag and
> > the
> > call to mutex_* functions is suspend()/resume().
> > 
> > > > > > +	if (st->reset) {
> > > > > > +		mutex_unlock(&st->lock);
> > > > > > +		return -EPERM;
> > > > > 
> > > > > Not sure what the error here should be, but EPERM is not
> > > > > correct.
> > > > > 
> > > > > Under which conditions would this function be called while in
> > > > > suspend ?
> > > > 
> > > > Honestly, this is more like a sanity check. I'm not sure if we
> > > > can
> > > > get
> > > > here in suspend mode. Don't userland apps can still run in
> > > > suspend?
> > > > I
> > > > guess so but I'm not 100% sure on this. Do you have any
> > > > recommendation
> > > > for the error here?
> > > > 
> > > Sorry, I won't accept "just in case" code.
> > > 
> > > Having said that, please inform yourself how suspend works.
> > > Userland
> > > code
> > > has to be stopped before any hardware can be disabled. Similar,
> > > hardware
> > > has to be re-enabled before userland code can resume.
> > > Otherwise the kernel would crash all over the place. In many
> > > cases,
> > > disabling the hardware means that trying to access hardware
> > > registers
> > > will cause an acess fault.
> > 
> > Yes, you are right. I did checked the suspend code and all userland
> > tasks and kthreads are stopped before the suspend() callback is
> > called
> > for the HW devices.
> > 
> > > [...]
> > > 
> > > > > > +
> > > > > > +static struct attribute *ltc2947_attrs[] = {
> > > > > > +	&sensor_dev_attr_in0_fault.dev_attr.attr,
> > > > > > +	&sensor_dev_attr_in1_fault.dev_attr.attr,
> > > > > > +	&sensor_dev_attr_curr1_fault.dev_attr.attr,
> > > > > > +	&sensor_dev_attr_temp1_fault.dev_attr.attr,
> > > > > > +	&sensor_dev_attr_power1_input.dev_attr.attr,
> > > > > > +	&sensor_dev_attr_power1_max.dev_attr.attr,
> > > > > > +	&sensor_dev_attr_power1_min.dev_attr.attr,
> > > > > > +	&sensor_dev_attr_power1_input_highest.dev_attr.attr,
> > > > > > +	&sensor_dev_attr_power1_input_lowest.dev_attr.attr,
> > > > > > +	&sensor_dev_attr_power1_fault.dev_attr.attr,
> > > > > > +	&sensor_dev_attr_energy1_input.dev_attr.attr,
> > > > > > +	&sensor_dev_attr_energy1_max.dev_attr.attr,
> > > > > > +	&sensor_dev_attr_energy1_min.dev_attr.attr,
> > > > > > +	&sensor_dev_attr_energy1_max_alarm.dev_attr.attr,
> > > > > > +	&sensor_dev_attr_energy1_min_alarm.dev_attr.attr,
> > > > > > +	&sensor_dev_attr_energy2_input.dev_attr.attr,
> > > > > > +	&sensor_dev_attr_energy2_max.dev_attr.attr,
> > > > > > +	&sensor_dev_attr_energy2_min.dev_attr.attr,
> > > > > > +	&sensor_dev_attr_energy2_max_alarm.dev_attr.attr,
> > > > > > +	&sensor_dev_attr_energy2_min_alarm.dev_attr.attr,
> > > > > > +	&sensor_dev_attr_energy1_overflow_alarm.dev_attr.attr,
> > > > > > +	&sensor_dev_attr_energy2_overflow_alarm.dev_attr.attr,
> > > > 
> > > > These overflow attributes are kind of an alarm for the energy
> > > > ones.
> > > > It
> > > > tells that the energy registers are about to overflow. I guess
> > > > that
> > > > some application can easily find out the maximum values
> > > > supported
> > > > on
> > > > these registers and implement whatever logic they want in the
> > > > app
> > > > itself. So, if you prefer I can just drop this ones?
> > > > 
> > > I understand the overflow use case. However, the mere presence
> > > of min/max attributes for energy attributes suggests that this
> > > is not the min/max use case for hwmon attributes. There is no
> > > "minimum"
> > > or "maximum" energy for an accumulating value. Such attributes
> > > only make sense in an application abler to measure available
> > > energy (eg a battery controller). I'll have to read the chip
> > > specification to understand the intended use case.
> > 
> > Should I just drop the overflow attributes? I think the part can be
> > used to check battery charging efficiency for example (though I
> > guess
> > we would need to also support the Charge register's).
> > 
> 
> If chip is (or can be) used as charger, it should register as such.
> Note my question was the energy limit attributes, not the overflow
> attributes.

I don't think it can be used as a charger but it can also measure
charge (integrating the measured current over time). As they are not
standard attributes I did not included this on the driver (I sent a
query on this before starting the driver - 
https://marc.info/?l=linux-hwmon&m=156507711612877&w=2).

I see your point about energy and having maximum and minimum for an
accumulated value. Honestly, looking at the chip specification I cannot
see the intended use case for this. Maybe for monitoring/characterizing
batteries since there are some controls on these accumulated values (we
can set the part to accumulate only when current is positive for
example).
I will do some internal querying to see if I can find out more on this.

> > > > > > +	&sensor_dev_attr_energy1_fault.dev_attr.attr,
> > > > > > +	&sensor_dev_attr_energy2_fault.dev_attr.attr,
> > > > > 
> > > > > Some of those are non-standard attributes. You'll have
> > > > > to explain each in detail, especially why it makes sense
> > > > > to provide such attributes to the user and why you can't
> > > > > use standard attributes instead. Also, for the _fault
> > > > > attributes, I don't entirely see the point. If the fault bit
> > > > > is set, ADC readings are not valid because supply voltage
> > > > > is low. This means that ADC register reads will be invalid.
> > > > > What is the point of having a non-standard attribute - which
> > > > > likely will be ignored - instead of returning an error when
> > > > > an attempt is made to read an ADC value ?
> > > > 
> > > > I was also not sure on this *_fault attributes. They are there
> > > > to
> > > > tell
> > > > that the readings are invalid. Now that I think about it, I'm
> > > > not
> > > > sure
> > > > if it even makes sense to return error if this bit is set. The
> > > > part
> > > > is
> > > > in continuous mode so, it might happen that we have the fault
> > > > bit
> > > > set
> > > > for a short time but afterwards things go normal and the bit
> > > > will
> > > > still
> > > > be set until we read it. So my point is, we might be returning
> > > > error
> > > > for a conversion that happened way before our current reading.
> > > > Any
> > > > suggestion here? Would you be fine if I just drop this
> > > > attributes?
> > > >    
> > > 
> > > It sounds like "fault" means something like "one of the past
> > > readings
> > > was wrong, but I don't know which one and I don't know if the
> > > wrong
> > > value was ever read by user space". Sorry, I fail to see what
> > > value
> > > those attributes are supposed to have for the user. At best it
> > > could
> > > mean "please re-read the associated attrribute", but that could
> > > as
> > > well
> > > be accomplished without userspace action if it is really needed.
> > > Also, per datasheet, it looks like the fault bit is set of the
> > > chip
> > > voltage is too low. If that happens, the system has a severe
> > > problem
> > > which can not be resolved with an attribute visible to userspace.
> > 
> > I will drop the fault attributes.
> > 
> > > > > Others, like energy1_input, or most of the power attributes,
> > > > > are standard attributes. Please explain the reasoning for
> > > > > not using the standard API for those.
> > > > 
> > > > This ones were because we need 64bit variables. For energy, the
> > > > part
> > > > also supports the alarms, max and min attributes so I included
> > > > them.
> > > >   
> > > I can to some degree the logic for energy attributes, but do you
> > > really
> > > have an application where the chip is used on a 32-bit system and
> > > monitors power larger than 2kW ?
> > 
> > Hmm, I looked again at the chip specification and unless I'm
> > missing
> > something obvious the part can only measure +/- 30A and 0-15V
> > giving us
> > +/- 450W which definitely fits in a long variable. The only thing
> > that
> > will be truncated is the min/max values. The part, by default, has
> > this
> > value to 0x7fff and 0x8000 which times 200000uW (part scale) will
> > be
> > truncated. Now, we can argue that this max/min values are not real
> > and
> > the user is expected to write this attributes with some meaningful
> > values? How do you suggest to proceed? Should I just use standard
> > attributes for power?
> >   
> How about detecting the overflow on read and just report the maximum
> supported value ? Or, alternatively, initialize the register with the
> maximum supported value.

Sounds good. I will initialize the register's on the setup() phase.

> Guenter





[Index of Archives]     [LM Sensors]     [Linux Sound]     [ALSA Users]     [ALSA Devel]     [Linux Audio Users]     [Linux Media]     [Kernel]     [Gimp]     [Yosemite News]     [Linux Media]

  Powered by Linux