On 5/25/2016 12:29 PM, Jon Hunter wrote: > > On 25/05/16 17:10, Jon Hunter wrote: > > ... > >> So power_supply_read_temp() calls ->get_property() and passes the >> power_supply psy struct which is initialised. The problem is that inside >> the bq27xxx driver, this then kicks off the worker thread to update the >> bq27xxx state and when this worker thread runs it attempts to access the >> same psy struct but by dereferencing a pointer to it from the >> bq27xxx_device_info where the pointer has not been initialised yet. >> Therefore, IMO it seems that we should not allow this worker thread to >> start until the registration has completed and hence the pointer is >> initialised. > > Sorry, it is not the actual worker thread that triggers the NULL pointer > deference, but the function bq27xxx_battery_poll() that schedules the > worker thread. Anyway, I still don't see that we need to update the > bq27xxx state during the registration especially seeing as we call > bq27xxx_battery_update() after the registration is complete. It seems > that updating the overall state should be mutually exclusive from > reading the temp. > > Looking at my patch, it does appear that the worker thread which also > calls bq27xxx_battery_update() is still scheduled and so may be it > should be ... > > diff --git a/drivers/power/bq27xxx_battery.c b/drivers/power/bq27xxx_battery.c > index 45f6ebf88df6..1334ed522332 100644 > --- a/drivers/power/bq27xxx_battery.c > +++ b/drivers/power/bq27xxx_battery.c > @@ -733,6 +733,9 @@ static void bq27xxx_battery_poll(struct work_struct *work) > container_of(work, struct bq27xxx_device_info, > work.work); > > + if (!di->bat) > + return; > + > bq27xxx_battery_update(di); > > if (poll_interval > 0) { > > I can see that getting the temperature could work. I would point out that I don't see any recent changes to bq27xxx or the power_supply_core that would imply this is a regression. My guess is that up until now, for devices that support the TEMP property, CONFIG_THERMAL isn't been enabled. So here are my thoughts.... we can do 2 things here: 1) patch bq27xxx in some manner that will allow the bq27xxx driver to work report the temp during register (such as above patch). 2) Patch the core to avoid using get_property callback during registration. I think for our immediate concern and crash, #1 is fine. It will work and is fine. I however think this is just a symptom of the larger issue (#2). In this case, the problem we see is that di->bat is used before it is set, and we have a way around it. However, for EVERY device that registers and has TEMP prop (and CONFIG_THERMAL enabled) it is going to receive a call with its relative di->bat uninitialized too. I don't know for certain if #2 has caused problems anywhere else, and I would be surprised if it has and hasn't been caught. AS far as this crash is concerned, I think either approach will work. Adding in David, Dmitry, and Sebastian (maintainers) to see if they have a preferred approach. -rhyland -- nvpublic -- To unsubscribe from this list: send the line "unsubscribe linux-tegra" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html