Re: [PATCH V4 05/12] trace/hwlat: Support hotplug operations

Steven Rostedt <rostedt@xxxxxxxxxxx> · Mon, 21 Jun 2021 13:46:36 -0400

On Mon, 21 Jun 2021 18:14:36 +0200
Daniel Bristot de Oliveira <bristot@xxxxxxxxxx> wrote:

> >> Yep! I tried to take the trace_type_lock here, and got the lockdep info about
> >> this problem.
> >>  
> >>> The only thing I could think of is to wake up a worker thread to do the
> >>> work. That is, this just wakes the worker thread, then the worker grabs
> >>> the trace_types_lock, iterates through the cpu mask of expect running
> >>> threads, and then starts or kills them depending on the hwlat_busy
> >>> value.    
> >> So, it will not wait for the kworker to run?  
> > What wont wait?  
> 
> For example, at the shutdown, should the hotplug callback wait for the workqueue
> to run & kill the thread, or not?

Doing that won't help the deadlock situation.

	CPU 1			CPU 2
	-----			-----
    Start shutdown
    down online_cpus()

			   mutex_lock(trace_types_lock);
			   get_online_cpus()
			   [BLOCK]

    wake_up_thread;
    [schedule worker]

    mutex_lock(trace_types_lock);

 [ DEADLOCK ]

Make all access to save_cpumask and hwlat_per_cpu_data inside the
get_online_cpus() protection. (like in move_to_next_cpu(),
start_single_thread() expand the get_online_cpus()).

Then in the cpu going down case, we can simply kill the thread and
update the save_cpumask, as it will be protected by the
get_online_cpus() code.

That is, don't even check if hwlat_busy is set or not. Just simply do:

CPU_DOWN:

	stop_cpu_kthead(cpu);

That will stop the kthread if it is running.  But we should update
that function to also set per_cpu(hwlat_per_cpu_data).kthread = NULL;
Like stop_single_kthread() does.

But for CPU_UP, we should do the work via a worker thread.

CPU_UP:
	schedule_work_on(&update_kthreads, cpu);

Which in the work function for that update_kthreads work queue:

	mutex_lock(&trace_types_lock);
	if (!hwlat_busy || hwlat_data.thread_mode != MODE_PER_CPU)
		goto out_unlock;

	get_online_cpus();
	if (!this_cpu(hwlat_per_cpu_data).kthread)
		start_per_cpu_kthread(smp_processor_id());
	put_online_cpus();

 out_unlock:
	mutex_unlock(&trace_types_lock);

Or something like that.

-- Steve