Hi Daniel, On 06.10.2022 08:55, Daniel Lezcano wrote: > > On 05/10/2022 15:05, Marek Szyprowski wrote: >> >> On 05.10.2022 14:37, Daniel Lezcano wrote: >>> >>> On 03/10/2022 23:18, Daniel Lezcano wrote: >>> >>> [ ... ] >>> >>>>> I've tested this v8 patchset after fixing the issue with Exynos TMU >>>>> with >>>>> https://lore.kernel.org/all/20221003132943.1383065-1-daniel.lezcano@xxxxxxxxxx/ >>>>> >>>>> >>>>> patch and I got the following lockdep warning on all Exynos-based >>>>> boards: >>>>> >>>>> >>>>> ====================================================== >>>>> WARNING: possible circular locking dependency detected >>>>> 6.0.0-rc1-00083-ge5c9d117223e #12945 Not tainted >>>>> ------------------------------------------------------ >>>>> swapper/0/1 is trying to acquire lock: >>>>> c1ce66b0 (&data->lock#2){+.+.}-{3:3}, at: exynos_get_temp+0x3c/0xc8 >>>>> >>>>> but task is already holding lock: >>>>> c2979b94 (&tz->lock){+.+.}-{3:3}, at: >>>>> thermal_zone_device_update.part.0+0x3c/0x528 >>>>> >>>>> which lock already depends on the new lock. >>>> >>>> I'm wondering if the problem is not already there and related to >>>> data->lock ... >>>> >>>> Doesn't the thermal zone lock already prevent racy access to the data >>>> structure? >>>> >>>> Another question: if the sensor clock is disabled after reading it, >>>> how does the hardware update the temperature and detect the programed >>>> threshold is crossed? >>> >>> just a gentle ping, as the fix will depend on your answer ;) >>> >> Sorry, I've been busy with other stuff. I thought I will fix this once I >> find a bit of spare time. > > Ok, that is great if you can find time to fix it up because I've other > drivers to convert to the generic thermal trips. > > >> IMHO the clock management is a bit over-engineered, as there is little >> (if any) benefit from such fine grade clock management. That clock is >> needed only for the AHB related part of the TMU (reading/writing the >> registers). The IRQ generation and temperature measurement is clocked >> from so called 'sclk' (special clock). >> >> I also briefly looked at the code and the internal lock doesn't look to >> be really necessary assuming that the thermal core already serializes >> all the calls. > > I looked at the code and I think the driver can be simplified (fixed?) > even more. > > IIUC, the sensor has multiple trip point interrupts, so if the device > tree is describing more trip points than the sensor supports, there is > a warning and the number of trip point is capped. > > IMO that can be simplified by using two trip point interrupt because > the thermal_zone_device_update() will call the set_trips callback with > the new boundaries. IOW, the thermal framework sets a new trip point > interrupt when one is crossed. > > That should result in the simplification of the tmu_control as well as > the tmu_probe function. As well as removing the limitation of the > number of trip points. > > In order to have that correctly working, the 'set_trips' ops must be > used to call the tmu_control callback instead of calling it in tmu_probe. > > The intialization workflow should be: > > probe->... > ->thermal_zone_device_register() > ->thermal_zone_device_update() > ->update_trip_points() > ->ops->set_trips() > ->tmu_control() > > Also, replace the workqueue by a threaded interrupt. > > Does it make sense? Yes, definitely. Frankly speaking I've never looked into that code, so I was not aware that it needs some cleanup. Best regards -- Marek Szyprowski, PhD Samsung R&D Institute Poland