Re: [PATCH] cgroup/cpuset: remove circular dependency deadlock

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Sep 07, 2017 at 07:26:23PM +0530, Prateek Sood wrote:
> Remove circular dependency deadlock in a scenario where hotplug of CPU is
> being done while there is updation in cgroup and cpuset triggered from
> userspace.
> 
> Process A => kthreadd => Process B => Process C => Process A

> Process A
> cpu_subsys_offline();
>   cpu_down();
>     _cpu_down();
>       percpu_down_write(&cpu_hotplug_lock); //held
>       cpuhp_invoke_callback();
>         workqueue_offline_cpu();
>           wq_update_unbound_numa();
>             kthread_create_on_node();
>               wake_up_process();  //wakeup kthreadd

TJ, I'm puzzled, why would we need to spawn new threads to update NUMA
affinity when taking a CPU out? That doesn't make sense to me, we can
either shrink the affinity of an existing thread or completely kill of a
thread if the mask becomes empty. But why spawn a new thread?

>           flush_work();
>           wait_for_completion();
> 
> kthreadd
> kthreadd();
>   kernel_thread();
>     do_fork();
>       copy_process();
>         percpu_down_read(&cgroup_threadgroup_rwsem);
>           __rwsem_down_read_failed_common(); //waiting

So this will eventually do our:

		complete() to make A go.


> Process B
> kernfs_fop_write();
>   cgroup_file_write();
>     cgroup_procs_write();
>       percpu_down_write(&cgroup_threadgroup_rwsem); //held
>       cgroup_attach_task();
>         cgroup_migrate();
>           cgroup_migrate_execute();
>             cpuset_can_attach();
>               mutex_lock(&cpuset_mutex); //waiting
> 
> Process C
> kernfs_fop_write();
>   cgroup_file_write();
>     cpuset_write_resmask();
>       mutex_lock(&cpuset_mutex); //held
>       update_cpumask();
>         update_cpumasks_hier();
>           rebuild_sched_domains_locked();
>             get_online_cpus();
>               percpu_down_read(&cpu_hotplug_lock); //waiting


So the whole thing looks like:


A		B			C		D

L(hotplug)
		L(threadgroup)
					L(cpuset)
							L(threadgroup)
WFC(c)
		L(cpuset)
					L(hotplug)
							C(c)

Yes, inverting cpuset and hotplug would break that chain, but I'm still
wondering why workqueue needs to spawn threads on CPU down.
--
To unsubscribe from this list: send the line "unsubscribe cgroups" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Security]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]     [Monitors]

  Powered by Linux