Hi Marek,
On 05/10/2022 15:05, Marek Szyprowski wrote:
On 05.10.2022 14:37, Daniel Lezcano wrote:
Hi Marek,
On 03/10/2022 23:18, Daniel Lezcano wrote:
[ ... ]
I've tested this v8 patchset after fixing the issue with Exynos TMU
with
https://lore.kernel.org/all/20221003132943.1383065-1-daniel.lezcano@xxxxxxxxxx/
patch and I got the following lockdep warning on all Exynos-based
boards:
======================================================
WARNING: possible circular locking dependency detected
6.0.0-rc1-00083-ge5c9d117223e #12945 Not tainted
------------------------------------------------------
swapper/0/1 is trying to acquire lock:
c1ce66b0 (&data->lock#2){+.+.}-{3:3}, at: exynos_get_temp+0x3c/0xc8
but task is already holding lock:
c2979b94 (&tz->lock){+.+.}-{3:3}, at:
thermal_zone_device_update.part.0+0x3c/0x528
which lock already depends on the new lock.
I'm wondering if the problem is not already there and related to
data->lock ...
Doesn't the thermal zone lock already prevent racy access to the data
structure?
Another question: if the sensor clock is disabled after reading it,
how does the hardware update the temperature and detect the programed
threshold is crossed?
just a gentle ping, as the fix will depend on your answer ;)
Sorry, I've been busy with other stuff. I thought I will fix this once I
find a bit of spare time.
Ok, that is great if you can find time to fix it up because I've other
drivers to convert to the generic thermal trips.
IMHO the clock management is a bit over-engineered, as there is little
(if any) benefit from such fine grade clock management. That clock is
needed only for the AHB related part of the TMU (reading/writing the
registers). The IRQ generation and temperature measurement is clocked
from so called 'sclk' (special clock).
I also briefly looked at the code and the internal lock doesn't look to
be really necessary assuming that the thermal core already serializes
all the calls.
I looked at the code and I think the driver can be simplified (fixed?)
even more.
IIUC, the sensor has multiple trip point interrupts, so if the device
tree is describing more trip points than the sensor supports, there is a
warning and the number of trip point is capped.
IMO that can be simplified by using two trip point interrupt because the
thermal_zone_device_update() will call the set_trips callback with the
new boundaries. IOW, the thermal framework sets a new trip point
interrupt when one is crossed.
That should result in the simplification of the tmu_control as well as
the tmu_probe function. As well as removing the limitation of the number
of trip points.
In order to have that correctly working, the 'set_trips' ops must be
used to call the tmu_control callback instead of calling it in tmu_probe.
The intialization workflow should be:
probe->...
->thermal_zone_device_register()
->thermal_zone_device_update()
->update_trip_points()
->ops->set_trips()
->tmu_control()
Also, replace the workqueue by a threaded interrupt.
Does it make sense?
--
<http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs
Follow Linaro: <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog