On 02.04.2018 13:56, Viresh Kumar wrote:
> This extends the sysfs interface for thermal cooling devices and exposes
> some pretty useful statistics. These statistics have proven to be quite
> useful specially while doing benchmarks related to the task scheduler,
> where we want to make sure that nothing has disrupted the test,
> specially the cooling device which may have put constraints on the CPUs.
> The information exposed here tells us to what extent the CPUs were
> constrained by the thermal framework.
>
> The write-only "reset" file is used to reset the statistics.
>
> The read-only "time_in_state_ms" file shows the time (in msec) spent by the
> device in the respective cooling states, and it prints one line per
> cooling state.
>
> The read-only "total_trans" file shows single positive integer value
> showing the total number of cooling state transitions the device has
> gone through since the time the cooling device is registered or the time
> when statistics were reset last.
>
> The read-only "trans_table" file shows a two dimensional matrix, where
> an entry <i,j> (row i, column j) represents the number of transitions
> from State_i to State_j.
>
> This is how the directory structure looks like for a single cooling
> device:
>
> $ ls -R /sys/class/thermal/cooling_device0/
> /sys/class/thermal/cooling_device0/:
> cur_state  max_state  power  stats  subsystem  type  uevent
>
> /sys/class/thermal/cooling_device0/power:
> autosuspend_delay_ms  runtime_active_time  runtime_suspended_time
> control               runtime_status
>
> /sys/class/thermal/cooling_device0/stats:
> reset  time_in_state_ms  total_trans  trans_table
>
> This is tested on ARM 64-bit Hisilicon hikey620 board running Ubuntu and
> ARM 64-bit Hisilicon hikey960 board running Android.
>
> Signed-off-by: Viresh Kumar <viresh.kumar@xxxxxxxxxx>
> ---

Hello,

I'm working on adding OPP and cooling support to the NVIDIA Tegra20/30
CPUFreq driver and stumbled upon a bug that is introduced by this patch.
It is triggered when the driver module is unloaded.

diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c
index 6ab982309e6a..de53c821a282 100644
--- a/drivers/thermal/thermal_core.c
+++ b/drivers/thermal/thermal_core.c
@@ -1102,8 +1102,8 @@ void thermal_cooling_device_unregister(struct thermal_cooling_device *cdev)
 	mutex_unlock(&thermal_list_lock);
 
 	ida_simple_remove(&thermal_cdev_ida, cdev->id);
-	device_unregister(&cdev->device);
 	thermal_cooling_device_destroy_sysfs(cdev);
+	device_unregister(&cdev->device);
 }
 EXPORT_SYMBOL_GPL(thermal_cooling_device_unregister);

This patch fixes the issue with the "cooling_device", but I'm not sure
that it won't break "thermal_zone". Also see the KASAN report below.

[   65.553469] ==================================================================
[   65.572514] BUG: KASAN: use-after-free in thermal_cooling_device_destroy_sysfs+0x24/0x40
[   65.592300] Read of size 4 at addr ced17c80 by task rmmod/206

[   65.632387] CPU: 1 PID: 206 Comm: rmmod Not tainted 4.18.0-rc8-next-20180810-00148-g2863c2b33049-dirty #361
[   65.654241] Hardware name: NVIDIA Tegra SoC (Flattened Device Tree)
[   65.676552] [<c0116784>] (unwind_backtrace) from [<c010fd54>] (show_stack+0x20/0x24)
[   65.699719] [<c010fd54>] (show_stack) from [<c10861b4>] (dump_stack+0x9c/0xb0)
[   65.723224] [<c10861b4>] (dump_stack) from [<c03012ac>] (print_address_description+0x60/0x268)
[   65.747525] [<c03012ac>] (print_address_description) from [<c03018c8>] (kasan_report+0x120/0x388)
[   65.771873] [<c03018c8>] (kasan_report) from [<c02fff44>] (__asan_load4+0x64/0xb4)
[   65.796324] [<c02fff44>] (__asan_load4) from [<c0b76d00>] (thermal_cooling_device_destroy_sysfs+0x24/0x40)
[   65.820990] [<c0b76d00>] (thermal_cooling_device_destroy_sysfs) from [<c0b73804>] (thermal_cooling_device_unregister+0x130/0x238)
[   65.846039] [<c0b73804>] (thermal_cooling_device_unregister) from [<c0b7a26c>] (cpufreq_cooling_unregister+0xa8/0xfc)
[   65.870897] [<c0b7a26c>] (cpufreq_cooling_unregister) from [<bf0003c0>] (tegra_cpu_exit+0x2c/0x74 [tegra20_cpufreq])
[   65.895940] [<bf0003c0>] (tegra_cpu_exit [tegra20_cpufreq]) from [<c0b83fa4>] (cpufreq_offline+0x160/0x298)
[   65.920899] [<c0b83fa4>] (cpufreq_offline) from [<c0b841cc>] (cpufreq_remove_dev+0xd0/0xd4)
[   65.945804] [<c0b841cc>] (cpufreq_remove_dev) from [<c0867c90>] (subsys_interface_unregister+0xe4/0x130)
[   65.971622] [<c0867c90>] (subsys_interface_unregister) from [<c0b823f0>] (cpufreq_unregister_driver+0x44/0x8c)
[   65.998135] [<c0b823f0>] (cpufreq_unregister_driver) from [<bf00002c>] (tegra20_cpufreq_remove+0x2c/0x34 [tegra20_cpufreq])
[   66.025805] [<bf00002c>] (tegra20_cpufreq_remove [tegra20_cpufreq]) from [<c086cde4>] (platform_drv_remove+0x44/0x64)
[   66.053768] [<c086cde4>] (platform_drv_remove) from [<c086a93c>] (device_release_driver_internal+0x1f0/0x2e0)
[   66.081707] [<c086a93c>] (device_release_driver_internal) from [<c086aab8>] (driver_detach+0x68/0xb8)
[   66.110346] [<c086aab8>] (driver_detach) from [<c0869128>] (bus_remove_driver+0x84/0xfc)
[   66.139530] [<c0869128>] (bus_remove_driver) from [<c086b898>] (driver_unregister+0x4c/0x6c)
[   66.169514] [<c086b898>] (driver_unregister) from [<c086cee8>] (platform_driver_unregister+0x1c/0x20)
[   66.200091] [<c086cee8>] (platform_driver_unregister) from [<bf000980>] (tegra20_cpufreq_driver_exit+0x18/0x698 [tegra20_cpufreq])
[   66.232017] [<bf000980>] (tegra20_cpufreq_driver_exit [tegra20_cpufreq]) from [<c01ff02c>] (sys_delete_module+0x198/0x224)
[   66.264804] [<c01ff02c>] (sys_delete_module) from [<c0101000>] (ret_fast_syscall+0x0/0x58)
[   66.298137] Exception stack(0xce94bfa8 to 0xce94bff0)
[   66.331825] bfa0:                   0003f0d0 00000002 0003f10c 00000800 5e6a7500 5e6a7500
[   66.366665] bfc0: 0003f0d0 00000002 0003f0d0 00000081 b6a723d0 b6a7207c b6a7226c 00000001
[   66.401864] bfe0: aec42610 b6a72014 00022408 aec4261c

[   66.472603] Allocated by task 151:
[   66.508377]  kasan_kmalloc+0xd4/0x174
[   66.544570]  kmem_cache_alloc_trace+0x198/0x2e8
[   66.581197]  __thermal_cooling_device_register+0x9c/0x4c0
[   66.618085]  thermal_of_cooling_device_register+0x18/0x1c
[   66.655387]  __cpufreq_cooling_register+0x4c4/0x604
[   66.692976]  of_cpufreq_cooling_register+0x88/0xe8
[   66.730726]  tegra_cpu_ready+0x28/0x3c [tegra20_cpufreq]
[   66.768872]  cpufreq_online+0x798/0x8d0
[   66.807262]  cpufreq_add_dev+0xa0/0xac
[   66.845892]  subsys_interface_register+0x104/0x148
[   66.884167]  cpufreq_register_driver+0x1d0/0x264
[   66.922070]  tegra20_cpufreq_probe+0x1f8/0x27c [tegra20_cpufreq]
[   66.959803]  platform_drv_probe+0x70/0xc8
[   66.997149]  really_probe+0x284/0x3d4
[   67.034006]  driver_probe_device+0x80/0x1b8
[   67.070515]  __driver_attach+0x130/0x134
[   67.106447]  bus_for_each_dev+0x98/0xc4
[   67.141867]  driver_attach+0x38/0x3c
[   67.177010]  bus_add_driver+0x238/0x2cc
[   67.211717]  driver_register+0xdc/0x1b0
[   67.245684]  __platform_driver_register+0x7c/0x84
[   67.279456]  0xbf005028
[   67.312693]  do_one_initcall+0x60/0x344
[   67.345795]  do_init_module+0xe4/0x30c
[   67.378294]  load_module+0x3008/0x3784
[   67.410423]  sys_finit_module+0xac/0xc4
[   67.442102]  ret_fast_syscall+0x0/0x58
[   67.472788]  0xb6781c10

[   67.531724] Freed by task 206:
[   67.560135]  __kasan_slab_free+0x12c/0x204
[   67.587993]  kasan_slab_free+0x14/0x18
[   67.615343]  kfree+0x90/0x294
[   67.642143]  thermal_release+0x6c/0x98
[   67.668309]  device_release+0x4c/0xe8
[   67.693667]  kobject_put+0xac/0x11c
[   67.718166]  device_unregister+0x2c/0x30
[   67.742308]  thermal_cooling_device_unregister+0x128/0x238
[   67.766189]  cpufreq_cooling_unregister+0xa8/0xfc
[   67.789630]  tegra_cpu_exit+0x2c/0x74 [tegra20_cpufreq]
[   67.812973]  cpufreq_offline+0x160/0x298
[   67.835506]  cpufreq_remove_dev+0xd0/0xd4
[   67.857115]  subsys_interface_unregister+0xe4/0x130
[   67.878280]  cpufreq_unregister_driver+0x44/0x8c
[   67.899235]  tegra20_cpufreq_remove+0x2c/0x34 [tegra20_cpufreq]
[   67.919948]  platform_drv_remove+0x44/0x64
[   67.940467]  device_release_driver_internal+0x1f0/0x2e0
[   67.960895]  driver_detach+0x68/0xb8
[   67.981161]  bus_remove_driver+0x84/0xfc
[   68.001382]  driver_unregister+0x4c/0x6c
[   68.021561]  platform_driver_unregister+0x1c/0x20
[   68.041879]  tegra20_cpufreq_driver_exit+0x18/0x698 [tegra20_cpufreq]
[   68.062376]  sys_delete_module+0x198/0x224
[   68.082826]  ret_fast_syscall+0x0/0x58
[   68.103010]  0xb6a72014

--
Dmitry