On Mon, 2019-10-07 at 05:44 -0700, Guenter Roeck wrote: > > On 10/7/19 5:25 AM, Sa, Nuno wrote: > > On Fri, 2019-10-04 at 08:06 -0700, Guenter Roeck wrote: > > > On Fri, Oct 04, 2019 at 07:45:07AM +0000, Sa, Nuno wrote: > > > [ ... ] > > > > > > +static int ltc2947_val_read(struct ltc2947_data *st, const > > > > > > u8 > > > > > > reg, > > > > > > + const u8 page, const size_t size, > > > > > > s64 *val) > > > > > > +{ > > > > > > + int ret; > > > > > > + u64 __val = 0; > > > > > > + > > > > > > + mutex_lock(&st->lock); > > > > > > + > > > > > > On a side note, suspend code is supposed to be atomic, > > > If this lock is held, the process or thread holding it > > > will likely stall suspend since it won't run. > > > At the very least, this would have to be a trylock, > > > with suspend failing if the lock can not be acquired. > > > > Got it. Even more, Now I don't see the point of having the mutex in > > the > > PM callbacks (though I saw other drivers doing this). As you said, > > at > > the very least the trylock should be used since a frozen task might > > have the mutex locked. Either way, I will drop both the flag and > > the > > call to mutex_* functions is suspend()/resume(). > > > > > > > > + if (st->reset) { > > > > > > + mutex_unlock(&st->lock); > > > > > > + return -EPERM; > > > > > > > > > > Not sure what the error here should be, but EPERM is not > > > > > correct. > > > > > > > > > > Under which conditions would this function be called while in > > > > > suspend ? > > > > > > > > Honestly, this is more like a sanity check. I'm not sure if we > > > > can > > > > get > > > > here in suspend mode. Don't userland apps can still run in > > > > suspend? > > > > I > > > > guess so but I'm not 100% sure on this. Do you have any > > > > recommendation > > > > for the error here? > > > > > > > Sorry, I won't accept "just in case" code. > > > > > > Having said that, please inform yourself how suspend works. > > > Userland > > > code > > > has to be stopped before any hardware can be disabled. Similar, > > > hardware > > > has to be re-enabled before userland code can resume. > > > Otherwise the kernel would crash all over the place. In many > > > cases, > > > disabling the hardware means that trying to access hardware > > > registers > > > will cause an acess fault. > > > > Yes, you are right. I did checked the suspend code and all userland > > tasks and kthreads are stopped before the suspend() callback is > > called > > for the HW devices. > > > > > [...] > > > > > > > > > + > > > > > > +static struct attribute *ltc2947_attrs[] = { > > > > > > + &sensor_dev_attr_in0_fault.dev_attr.attr, > > > > > > + &sensor_dev_attr_in1_fault.dev_attr.attr, > > > > > > + &sensor_dev_attr_curr1_fault.dev_attr.attr, > > > > > > + &sensor_dev_attr_temp1_fault.dev_attr.attr, > > > > > > + &sensor_dev_attr_power1_input.dev_attr.attr, > > > > > > + &sensor_dev_attr_power1_max.dev_attr.attr, > > > > > > + &sensor_dev_attr_power1_min.dev_attr.attr, > > > > > > + &sensor_dev_attr_power1_input_highest.dev_attr.attr, > > > > > > + &sensor_dev_attr_power1_input_lowest.dev_attr.attr, > > > > > > + &sensor_dev_attr_power1_fault.dev_attr.attr, > > > > > > + &sensor_dev_attr_energy1_input.dev_attr.attr, > > > > > > + &sensor_dev_attr_energy1_max.dev_attr.attr, > > > > > > + &sensor_dev_attr_energy1_min.dev_attr.attr, > > > > > > + &sensor_dev_attr_energy1_max_alarm.dev_attr.attr, > > > > > > + &sensor_dev_attr_energy1_min_alarm.dev_attr.attr, > > > > > > + &sensor_dev_attr_energy2_input.dev_attr.attr, > > > > > > + &sensor_dev_attr_energy2_max.dev_attr.attr, > > > > > > + &sensor_dev_attr_energy2_min.dev_attr.attr, > > > > > > + &sensor_dev_attr_energy2_max_alarm.dev_attr.attr, > > > > > > + &sensor_dev_attr_energy2_min_alarm.dev_attr.attr, > > > > > > + &sensor_dev_attr_energy1_overflow_alarm.dev_attr.attr, > > > > > > + &sensor_dev_attr_energy2_overflow_alarm.dev_attr.attr, > > > > > > > > These overflow attributes are kind of an alarm for the energy > > > > ones. > > > > It > > > > tells that the energy registers are about to overflow. I guess > > > > that > > > > some application can easily find out the maximum values > > > > supported > > > > on > > > > these registers and implement whatever logic they want in the > > > > app > > > > itself. So, if you prefer I can just drop this ones? > > > > > > > I understand the overflow use case. However, the mere presence > > > of min/max attributes for energy attributes suggests that this > > > is not the min/max use case for hwmon attributes. There is no > > > "minimum" > > > or "maximum" energy for an accumulating value. Such attributes > > > only make sense in an application abler to measure available > > > energy (eg a battery controller). I'll have to read the chip > > > specification to understand the intended use case. > > > > Should I just drop the overflow attributes? I think the part can be > > used to check battery charging efficiency for example (though I > > guess > > we would need to also support the Charge register's). > > > > If chip is (or can be) used as charger, it should register as such. > Note my question was the energy limit attributes, not the overflow > attributes. I don't think it can be used as a charger but it can also measure charge (integrating the measured current over time). As they are not standard attributes I did not included this on the driver (I sent a query on this before starting the driver - https://marc.info/?l=linux-hwmon&m=156507711612877&w=2). I see your point about energy and having maximum and minimum for an accumulated value. Honestly, looking at the chip specification I cannot see the intended use case for this. Maybe for monitoring/characterizing batteries since there are some controls on these accumulated values (we can set the part to accumulate only when current is positive for example). I will do some internal querying to see if I can find out more on this. > > > > > > + &sensor_dev_attr_energy1_fault.dev_attr.attr, > > > > > > + &sensor_dev_attr_energy2_fault.dev_attr.attr, > > > > > > > > > > Some of those are non-standard attributes. You'll have > > > > > to explain each in detail, especially why it makes sense > > > > > to provide such attributes to the user and why you can't > > > > > use standard attributes instead. Also, for the _fault > > > > > attributes, I don't entirely see the point. If the fault bit > > > > > is set, ADC readings are not valid because supply voltage > > > > > is low. This means that ADC register reads will be invalid. > > > > > What is the point of having a non-standard attribute - which > > > > > likely will be ignored - instead of returning an error when > > > > > an attempt is made to read an ADC value ? > > > > > > > > I was also not sure on this *_fault attributes. They are there > > > > to > > > > tell > > > > that the readings are invalid. Now that I think about it, I'm > > > > not > > > > sure > > > > if it even makes sense to return error if this bit is set. The > > > > part > > > > is > > > > in continuous mode so, it might happen that we have the fault > > > > bit > > > > set > > > > for a short time but afterwards things go normal and the bit > > > > will > > > > still > > > > be set until we read it. So my point is, we might be returning > > > > error > > > > for a conversion that happened way before our current reading. > > > > Any > > > > suggestion here? Would you be fine if I just drop this > > > > attributes? > > > > > > > > > > It sounds like "fault" means something like "one of the past > > > readings > > > was wrong, but I don't know which one and I don't know if the > > > wrong > > > value was ever read by user space". Sorry, I fail to see what > > > value > > > those attributes are supposed to have for the user. At best it > > > could > > > mean "please re-read the associated attrribute", but that could > > > as > > > well > > > be accomplished without userspace action if it is really needed. > > > Also, per datasheet, it looks like the fault bit is set of the > > > chip > > > voltage is too low. If that happens, the system has a severe > > > problem > > > which can not be resolved with an attribute visible to userspace. > > > > I will drop the fault attributes. > > > > > > > Others, like energy1_input, or most of the power attributes, > > > > > are standard attributes. Please explain the reasoning for > > > > > not using the standard API for those. > > > > > > > > This ones were because we need 64bit variables. For energy, the > > > > part > > > > also supports the alarms, max and min attributes so I included > > > > them. > > > > > > > I can to some degree the logic for energy attributes, but do you > > > really > > > have an application where the chip is used on a 32-bit system and > > > monitors power larger than 2kW ? > > > > Hmm, I looked again at the chip specification and unless I'm > > missing > > something obvious the part can only measure +/- 30A and 0-15V > > giving us > > +/- 450W which definitely fits in a long variable. The only thing > > that > > will be truncated is the min/max values. The part, by default, has > > this > > value to 0x7fff and 0x8000 which times 200000uW (part scale) will > > be > > truncated. Now, we can argue that this max/min values are not real > > and > > the user is expected to write this attributes with some meaningful > > values? How do you suggest to proceed? Should I just use standard > > attributes for power? > > > How about detecting the overflow on read and just report the maximum > supported value ? Or, alternatively, initialize the register with the > maximum supported value. Sounds good. I will initialize the register's on the setup() phase. > Guenter