On 2/21/22 21:22, Guenter Roeck wrote: > If an attempt is made to a sensor with a thermal zone and it fails, > the call to devm_thermal_zone_of_sensor_register() may return -ENODEV. > This may result in crashes similar to the following. > > Unable to handle kernel NULL pointer dereference at virtual address 00000000000003cd > ... > Internal error: Oops: 96000021 [#1] PREEMPT SMP > ... > pstate: 60400009 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) > pc : mutex_lock+0x18/0x60 > lr : thermal_zone_device_update+0x40/0x2e0 > sp : ffff800014c4fc60 > x29: ffff800014c4fc60 x28: ffff365ee3f6e000 x27: ffffdde218426790 > x26: ffff365ee3f6e000 x25: 0000000000000000 x24: ffff365ee3f6e000 > x23: ffffdde218426870 x22: ffff365ee3f6e000 x21: 00000000000003cd > x20: ffff365ee8bf3308 x19: ffffffffffffffed x18: 0000000000000000 > x17: ffffdde21842689c x16: ffffdde1cb7a0b7c x15: 0000000000000040 > x14: ffffdde21a4889a0 x13: 0000000000000228 x12: 0000000000000000 > x11: 0000000000000000 x10: 0000000000000000 x9 : 0000000000000000 > x8 : 0000000001120000 x7 : 0000000000000001 x6 : 0000000000000000 > x5 : 0068000878e20f07 x4 : 0000000000000000 x3 : 00000000000003cd > x2 : ffff365ee3f6e000 x1 : 0000000000000000 x0 : 00000000000003cd > Call trace: > mutex_lock+0x18/0x60 > hwmon_notify_event+0xfc/0x110 > 0xffffdde1cb7a0a90 > 0xffffdde1cb7a0b7c > irq_thread_fn+0x2c/0xa0 > irq_thread+0x134/0x240 > kthread+0x178/0x190 > ret_from_fork+0x10/0x20 > Code: d503201f d503201f d2800001 aa0103e4 (c8e47c02) > > Jon Hunter reports that the exact call sequence is: > > hwmon_notify_event() > --> hwmon_thermal_notify() > --> thermal_zone_device_update() > --> update_temperature() > --> mutex_lock() > > The hwmon core needs to handle all errors returned from calls > to devm_thermal_zone_of_sensor_register(). If the call fails > with -ENODEV, report that the sensor was not attached to a > thermal zone but continue to register the hwmon device. > > Reported-by: Jon Hunter <jonathanh@xxxxxxxxxx> > Cc: Dmitry Osipenko <digetx@xxxxxxxxx> > Fixes: 1597b374af222 ("hwmon: Add notification support") > Signed-off-by: Guenter Roeck <linux@xxxxxxxxxxxx> > --- > v2: Use dev_info instead of dev_warn, and change message to be > less alarming. > > drivers/hwmon/hwmon.c | 14 ++++++++------ > 1 file changed, 8 insertions(+), 6 deletions(-) > > diff --git a/drivers/hwmon/hwmon.c b/drivers/hwmon/hwmon.c > index 3501a3ead4ba..3ae961986fc3 100644 > --- a/drivers/hwmon/hwmon.c > +++ b/drivers/hwmon/hwmon.c > @@ -214,12 +214,14 @@ static int hwmon_thermal_add_sensor(struct device *dev, int index) > > tzd = devm_thermal_zone_of_sensor_register(dev, index, tdata, > &hwmon_thermal_ops); > - /* > - * If CONFIG_THERMAL_OF is disabled, this returns -ENODEV, > - * so ignore that error but forward any other error. > - */ > - if (IS_ERR(tzd) && (PTR_ERR(tzd) != -ENODEV)) > - return PTR_ERR(tzd); > + if (IS_ERR(tzd)) { > + if (PTR_ERR(tzd) != -ENODEV) > + return PTR_ERR(tzd); > + dev_info(dev, "temp%d_input not attached to any thermal zone\n", > + index + 1); > + devm_kfree(dev, tdata); > + return 0; > + } > > err = devm_add_action(dev, hwmon_thermal_remove_sensor, &tdata->node); > if (err) LGTM, thank you. But still needs a t-b from Jon. Reviewed-by: Dmitry Osipenko <dmitry.osipenko@xxxxxxxxxxxxx>