On Mon, 21 Jun 2021 18:14:36 +0200 Daniel Bristot de Oliveira <bristot@xxxxxxxxxx> wrote: > >> Yep! I tried to take the trace_type_lock here, and got the lockdep info about > >> this problem. > >> > >>> The only thing I could think of is to wake up a worker thread to do the > >>> work. That is, this just wakes the worker thread, then the worker grabs > >>> the trace_types_lock, iterates through the cpu mask of expect running > >>> threads, and then starts or kills them depending on the hwlat_busy > >>> value. > >> So, it will not wait for the kworker to run? > > What wont wait? > > For example, at the shutdown, should the hotplug callback wait for the workqueue > to run & kill the thread, or not? Doing that won't help the deadlock situation. CPU 1 CPU 2 ----- ----- Start shutdown down online_cpus() mutex_lock(trace_types_lock); get_online_cpus() [BLOCK] wake_up_thread; [schedule worker] mutex_lock(trace_types_lock); [ DEADLOCK ] Make all access to save_cpumask and hwlat_per_cpu_data inside the get_online_cpus() protection. (like in move_to_next_cpu(), start_single_thread() expand the get_online_cpus()). Then in the cpu going down case, we can simply kill the thread and update the save_cpumask, as it will be protected by the get_online_cpus() code. That is, don't even check if hwlat_busy is set or not. Just simply do: CPU_DOWN: stop_cpu_kthead(cpu); That will stop the kthread if it is running. But we should update that function to also set per_cpu(hwlat_per_cpu_data).kthread = NULL; Like stop_single_kthread() does. But for CPU_UP, we should do the work via a worker thread. CPU_UP: schedule_work_on(&update_kthreads, cpu); Which in the work function for that update_kthreads work queue: mutex_lock(&trace_types_lock); if (!hwlat_busy || hwlat_data.thread_mode != MODE_PER_CPU) goto out_unlock; get_online_cpus(); if (!this_cpu(hwlat_per_cpu_data).kthread) start_per_cpu_kthread(smp_processor_id()); put_online_cpus(); out_unlock: mutex_unlock(&trace_types_lock); Or something like that. -- Steve