From: "Paul E. McKenney" <paulmck@xxxxxxxxxx>

The advent of commit 06249738a41a ("workqueue: Manually break affinity
on hotplug") means that the scheduler no longer silently breaks affinity
for kthreads pinned to the outgoing CPU. This can happen for many of
rcutorture's kthreads due to shuffling, which periodically affinities
these kthreads away from a randomly chosen CPU. This usually works fine
because these kthreads are allowed to run on any other CPU and because
shuffling is a no-op any time there is but one online CPU.

However, consider the following sequence of events:

1.	CPUs 0 and 1 are initially online.

2.	The torture_shuffle_tasks() function affinities all the tasks
	away from CPU 0.

3.	CPU 1 goes offline.

4.	All the tasks are now affinitied to an offline CPU, triggering
	the warning added by the commit noted above.

This can trigger the following in sched_cpu_dying() in kernel/sched/core.c:

	BUG_ON(rq->nr_running != 1 || rq_has_pinned_tasks(rq))

This commit therefore adds a new torture_shuffle_tasks_offline() function
that is invoked from torture_offline() prior to offlining a CPU. This new
function scans the list of shuffled kthreads and for any thread that
last ran (or is set to run) on the outgoing CPU, sets its affinity to
all online CPUs. Thus there will never be a kthread that is affinitied
only to the outgoing CPU.

Of course, if the sysadm manually applies affinity to any of these
kthreads, all bets are off. However, such a sysadm must be fast because
the torture_shuffle_tasks_offline() function is invoked immediately before
offlining the outgoing CPU. Therefore, let it be known that with great
speed and great power comes great responsibility.

Signed-off-by: Paul E. McKenney <paulmck@xxxxxxxxxx>
---
 kernel/torture.c | 17 +++++++++++++++++
 1 file changed, 17 insertions(+)

diff --git a/kernel/torture.c b/kernel/torture.c
index 01e336f..40c5c68 100644
--- a/kernel/torture.c
+++ b/kernel/torture.c
@@ -155,6 +155,8 @@ EXPORT_SYMBOL_GPL(torture_hrtimeout_s);
 
 #ifdef CONFIG_HOTPLUG_CPU
 
+static void torture_shuffle_tasks_offline(int cpu);
+
 /*
  * Variables for online-offline handling. Only present if CPU hotplug
  * is enabled, otherwise does nothing.
@@ -212,6 +214,7 @@ bool torture_offline(int cpu, long *n_offl_attempts, long *n_offl_successes,
 			 torture_type, cpu);
 	starttime = jiffies;
 	(*n_offl_attempts)++;
+	torture_shuffle_tasks_offline(cpu);
 	ret = remove_cpu(cpu);
 	if (ret) {
 		s = "";
@@ -512,6 +515,20 @@ static void torture_shuffle_task_unregister_all(void)
 	mutex_unlock(&shuffle_task_mutex);
 }
 
+#ifdef CONFIG_HOTPLUG_CPU
+// Unbind all tasks from a CPU that is to be taken offline.
+static void torture_shuffle_tasks_offline(int cpu)
+{
+	struct shuffle_task *stp;
+
+	mutex_lock(&shuffle_task_mutex);
+	list_for_each_entry(stp, &shuffle_task_list, st_l)
+		if (task_cpu(stp->st_t) == cpu)
+			set_cpus_allowed_ptr(stp->st_t, cpu_online_mask);
+	mutex_unlock(&shuffle_task_mutex);
+}
+#endif // #ifdef CONFIG_HOTPLUG_CPU
+
 /* Shuffle tasks such that we allow shuffle_idle_cpu to become idle.
  * A special case is when shuffle_idle_cpu = -1, in which case we allow
  * the tasks to run on all CPUs.
-- 
2.9.5
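
For readers who want to step through the four-event sequence in the
commit log without a kernel at hand, the following is a rough userspace
model of it, not kernel code. Every name in it (toy_task, toy_shuffle(),
toy_unbind_from(), toy_offline(), toy_count_stranded(), toy_scenario())
is hypothetical and exists only for this illustration. Plain bitmasks
stand in for cpumasks, and the model assumes two CPUs and three tasks.

```c
#include <assert.h>

#define NR_CPUS  2	/* assumption: two CPUs, as in the commit log */
#define NR_TASKS 3	/* assumption: arbitrary number of shuffled tasks */

/* Hypothetical stand-in for a shuffled kthread. */
struct toy_task {
	unsigned int allowed;	/* bit n set => task may run on CPU n */
	int cpu;		/* CPU the task last ran on */
};

static unsigned int online_mask;
static struct toy_task tasks[NR_TASKS];

/* Return the lowest-numbered CPU in mask, or -1 if mask is empty. */
static int toy_first_cpu(unsigned int mask)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		if (mask & (1u << cpu))
			return cpu;
	return -1;
}

/* Model of shuffling: affinity all tasks away from idle_cpu. */
static void toy_shuffle(int idle_cpu)
{
	assert(idle_cpu >= 0 && idle_cpu < NR_CPUS);
	unsigned int mask = online_mask & ~(1u << idle_cpu);

	if (!mask)	/* only one online CPU: shuffling is a no-op */
		return;
	for (int i = 0; i < NR_TASKS; i++) {
		tasks[i].allowed = mask;
		if (!(mask & (1u << tasks[i].cpu)))
			tasks[i].cpu = toy_first_cpu(mask);
	}
}

/* Model of the fix: widen affinity of tasks last seen on the outgoing CPU. */
static void toy_unbind_from(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	for (int i = 0; i < NR_TASKS; i++)
		if (tasks[i].cpu == cpu)
			tasks[i].allowed = online_mask;
}

/* Model of taking a CPU offline: only the online mask changes. */
static void toy_offline(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	online_mask &= ~(1u << cpu);
}

/* Count tasks whose allowed mask contains no online CPU. */
static int toy_count_stranded(void)
{
	int n = 0;

	for (int i = 0; i < NR_TASKS; i++)
		if (!(tasks[i].allowed & online_mask))
			n++;
	return n;
}

/* Replay events 1-4, optionally applying the fix just before the offline. */
static int toy_scenario(int apply_fix)
{
	online_mask = (1u << NR_CPUS) - 1;		/* 1. CPUs 0 and 1 online */
	for (int i = 0; i < NR_TASKS; i++) {
		tasks[i].allowed = online_mask;
		tasks[i].cpu = 0;
	}
	toy_shuffle(0);					/* 2. shuffle away from CPU 0 */
	if (apply_fix)
		toy_unbind_from(1);
	toy_offline(1);					/* 3. CPU 1 goes offline */
	return toy_count_stranded();			/* 4. tasks left with no CPU */
}
```

Calling toy_scenario(0) replays the sequence without the fix and leaves
every task with an allowed mask containing only the offline CPU, while
toy_scenario(1) leaves none. The model deliberately ignores locking and
the scheduler's own migration of tasks off the dying CPU.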