Re: (bug report) HWMON & Thermal interactions #forregzbot

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



[Note: this mail is primarily send for documentation purposes and/or for
regzbot, my Linux kernel regression tracking bot. That's why I removed
most or all folks from the list of recipients, but left any that looked
like a mailing lists. These mails usually contain '#forregzbot' in the
subject, to make them easy to spot and filter out.]

[TLDR: I'm adding this regression report to the list of tracked
regressions; all text from me you find below is based on a few templates
paragraphs you might have encountered already already in similar form.]

Hi, this is your Linux kernel regression tracker. CCing the regression
mailing list, as it should be in the loop for all regressions, as
explained here:
https://www.kernel.org/doc/html/latest/admin-guide/reporting-issues.html

On 23.10.22 20:27, Cristian Marussi wrote:
> 
> Starting with v6.1-rc1 the SCMI HWMON driver failed probing on my JUNO due
> to the fact that no trip points were (ever !) defined in the DT; bisecting it
> looks like that after:

Thanks for the report. To be sure below issue doesn't fall through the
cracks unnoticed, I'm adding it to regzbot, my Linux kernel regression
tracking bot:

#regzbot ^introduced e51813313
#regzbot title SCMI HWMON driver failed probing on JUNO
#regzbot ignore-activity

This isn't a regression? This issue or a fix for it are already
discussed somewhere else? It was fixed already? You want to clarify when
the regression started to happen? Or point out I got the title or
something else totally wrong? Then just reply -- ideally with also
telling regzbot about it, as explained here:
https://linux-regtracking.leemhuis.info/tracked-regression/

Reminder for developers: When fixing the issue, add 'Link:' tags
pointing to the report (the mail this one replies to), as explained for
in the Linux kernel's documentation; above webpage explains why this is
important for tracked regressions.

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)

P.S.: As the Linux kernel's regression tracker I deal with a lot of
reports and sometimes miss something important when writing mails like
this. If that's the case here, don't hesitate to tell me in a public
reply, it's in everyone's interest to set the public record straight.

> https://lore.kernel.org/all/20220804224349.1926752-28-daniel.lezcano@xxxxxxxxxx/
> 
> the presence of the mandatory trips node within thermal zones is now
> enforced.
> 
> So, this is NOT what this bug report is about (I'll post soon patches for
> the JUNO DT missing trips) BUT once this problem was solved in the DT,
> another issue appeared:
> 
> [    1.921929] hwmon hwmon0: temp2_input not attached to any thermal zone
> 
> that despite having now a goodi/valid DT describing 2 sensors and 2 thermal zones
> embedding that sensors, only the first one is found as belonging to one ThermZ.
> (this happens ALSO with v6.0 once I added the trips...)
> 
> Digging deep into this, it turned out that inside the call chain
> 
> devm_hwmon_device_register_with_info
>   hwmon_device_register_with_info
>     __hwmon_device_register
> 	hwmon_thermal_register_sensors(dev)
> 		--> hwmon_thermal_add_sensor(dev, j)
> 			--> devm_thermal_of_zone_register(dev, sensor_id, tdata, )
> 
> the HWMON channel index j is passed to the Thermal framework in order to
> search and bind sensors with defined thermal zone, but this lead to the
> assumption that sequential HWMON channel indexes corresponds one-to-one to the
> underlying real sensor IDs that the ThermalFramework uses for matching
> within the DT.
> 
> On a system like my SCMI-based DT where I have 2 temp-sensors bound to 2
> thermal zones like:
> 
> thernal_zones {
> 	pmic {
> 		...
> 		thermal-sensors = <&scmi_sensors0 0>;
> 		...
> 		trips {
> 			...
> 		}
> 	soc {
> 		...
> 		thermal-sensors = <&scmi_sensors0 3>;
> 		...
> 		trips {
> 			...
> 		}
> 	}
> }
> 
> This works fine by chance for the pmic (j=0, sensor_id=0) BUT cannot work for
> the soc where J=1 BUT the real sensor ID is 3.
> 
> Note that there can be a number of sensors, not all of them of a type handled
> by HWMON, and enumerated by SCMI in different ways depending on the
> platform.
> 
> I suppose this is not an SCMI-only related issue, but maybe in non-SCMI
> context, where sensors are purely defined in the DT, the solution can be
> more easily attained (i.e. renumber the sensors).
> 
> At first I tried to solve this inside scmi-hwmon.c BUT I could not find
> a way to present to the HWMON subsystem the list of sensors preserving
> the above index/sensor_id matching (not even with a hack like passing
> down dummy sensors to the HWMON subsystem to fill the 'holes' in the
> numbering)
> 
> My tentative solution, which works fine for me in my context, was to add
> an optional HWMON hwops, so that the core hwmon can retrieve if needed the
> real sensor ID if different from the channel index (using an optional hwops
> instead of some static hwinfo var let me avoid to have to patch all the
> existent hwmon drivers that happens to just work fine as of today...but
> maybe it is not necessarily the proper final solution...)
> 
> i.e.
> 
> ----8<----
> 
> Author: Cristian Marussi <cristian.marussi@xxxxxxx>
> Date:   Fri Oct 21 17:24:04 2022 +0100
> 
>     hwmon: Add new .get_sensor_id hwops
>     
>     Add a new optional helper which can be defined to allow an hwmon chip to
>     provide the logic to map hwmon indexes to the real underlying sensor IDs.
>     
>     Signed-off-by: Cristian Marussi <cristian.marussi@xxxxxxx>
> 
> diff --git a/drivers/hwmon/hwmon.c b/drivers/hwmon/hwmon.c
> index 4218750d5a66..45d3d5070cde 100644
> --- a/drivers/hwmon/hwmon.c
> +++ b/drivers/hwmon/hwmon.c
> @@ -213,7 +213,8 @@ static void hwmon_thermal_remove_sensor(void *data)
>         list_del(data);
>  }
>  
> -static int hwmon_thermal_add_sensor(struct device *dev, int index)
> +static int hwmon_thermal_add_sensor(struct device *dev, int index,
> 7+                                   unsigned int sensor_id)
>  {
>         struct hwmon_device *hwdev = to_hwmon_device(dev);
>         struct hwmon_thermal_data *tdata;
> @@ -227,7 +228,7 @@ static int hwmon_thermal_add_sensor(struct device *dev, int index)
>         tdata->dev = dev;
>         tdata->index = index;
>  
> -       tzd = devm_thermal_of_zone_register(dev, index, tdata,
> +       tzd = devm_thermal_of_zone_register(dev, sensor_id, tdata,
>                                             &hwmon_thermal_ops);
>         if (IS_ERR(tzd)) {
>                 if (PTR_ERR(tzd) != -ENODEV)
> @@ -264,13 +265,18 @@ static int hwmon_thermal_register_sensors(struct device *dev)
>  
>                 for (j = 0; info[i]->config[j]; j++) {
>                         int err;
> +                       unsigned int id;
>  
>                         if (!(info[i]->config[j] & HWMON_T_INPUT) ||
>                             !chip->ops->is_visible(drvdata, hwmon_temp,
>                                                    hwmon_temp_input, j))
>                                 continue;
>  
> -                       err = hwmon_thermal_add_sensor(dev, j);
> +                       id = !chip->ops->get_sensor_id ? j :
> +                               chip->ops->get_sensor_id(drvdata,
> +                                                        hwmon_temp, j);
> +
> +                       err = hwmon_thermal_add_sensor(dev, j, id);
>                         if (err)
>                                 return err;
>                 }
> diff --git a/include/linux/hwmon.h b/include/linux/hwmon.h
> index 14325f93c6b2..e5dbab83f4d1 100644
> --- a/include/linux/hwmon.h
> +++ b/include/linux/hwmon.h
> @@ -396,6 +396,9 @@ enum hwmon_intrusion_attributes {
>  struct hwmon_ops {
>         umode_t (*is_visible)(const void *drvdata, enum hwmon_sensor_types type,
>                               u32 attr, int channel);
> +       unsigned int (*get_sensor_id)(const void *drvdata,
> +                                     enum hwmon_sensor_types type,
> +                                     int channel);
>         int (*read)(struct device *dev, enum hwmon_sensor_types type,
>                     u32 attr, int channel, long *val);
>         int (*read_string)(struct device *dev, enum hwmon_sensor_types type,
> 
> 
> ----->8----
> 
> 
> ... plus obviously the related scmi-hwmon.c patch to make use of this.
> 
> Any thought ? Am I missing something ?
> (not really an expert on both subsystems really ... :P)
> 
> Thanks,
> Cristian
> 



[Index of Archives]     [LM Sensors]     [Linux Sound]     [ALSA Users]     [ALSA Devel]     [Linux Audio Users]     [Linux Media]     [Kernel]     [Gimp]     [Yosemite News]     [Linux Media]

  Powered by Linux