Re: [PATCH v2] sched/core: fix illegal RCU from offline CPUs

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, Jan 20, 2020 at 03:35:09PM -0500, Qian Cai wrote:
> > On Jan 20, 2020, at 5:17 AM, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
> > 
> > Bah.. that's horrible. Surely we can find a better place to do this in
> > the whole hotplug machinery.
> > 
> > Perhaps you can have takedown_cpu() do the mmdrop()?
> 
> The problem is that no all arch_cpu_idle_dead() will call
> idle_task_exit(). For example, alpha and parisc are not, so it needs

How is that not broken? If the idle thread runs on an active_mm, we need
to drop that reference. This needs fixing regardless.

> to deal with some kind of ifdef dance in takedown_cpu() to
> conditionally call mmdrop() which sounds even more horrible?
> 
> If you really prefer it anyway, maybe something like touching every arch’s __cpu_die() to also call mmdrop() depends on arches?
> 
> BTW, how to obtain the other CPU’s current task mm? Is that cpu_rq(cpu)->curr->active_mm?

Something like this; except you'll need to go audit archs to make sure
they all call idle_task_exit() and/or put in comments on why they don't
have to (perhaps their bringup switches them to &init_mm unconditionally
and the switch_mm() is not required).

---
diff --git a/kernel/cpu.c b/kernel/cpu.c
index 9c706af713fb..2b4d8a69e8d9 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -564,6 +564,23 @@ static int bringup_cpu(unsigned int cpu)
 	return bringup_wait_for_ap(cpu);
 }
 
+static int finish_cpu(unsigned int cpu)
+{
+	struct task_struct *idle = idle_thread_get(cpu);
+	struct mm_struct *mm = idle->active_mm;
+
+	/*
+	 * idle_task_exit() will have switched to &init_mm, now
+	 * clean up any remaining active_mm state.
+	 */
+
+	if (mm == &init_mm)
+		return;
+
+	idle->active_mm = &init_mm;
+	mmdrop(mm);
+}
+
 /*
  * Hotplug state machine related functions
  */
@@ -1434,7 +1451,7 @@ static struct cpuhp_step cpuhp_hp_states[] = {
 	[CPUHP_BRINGUP_CPU] = {
 		.name			= "cpu:bringup",
 		.startup.single		= bringup_cpu,
-		.teardown.single	= NULL,
+		.teardown.single	= finish_cpu,
 		.cant_stop		= true,
 	},
 	/* Final state before CPU kills itself */
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index fc1dfc007604..8f049fb77a3d 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -6188,13 +6188,14 @@ void idle_task_exit(void)
 	struct mm_struct *mm = current->active_mm;
 
 	BUG_ON(cpu_online(smp_processor_id()));
+	BUG_ON(current != this_rq()->idle);
 
 	if (mm != &init_mm) {
 		switch_mm(mm, &init_mm, current);
-		current->active_mm = &init_mm;
 		finish_arch_post_lock_switch();
 	}
-	mmdrop(mm);
+
+	/* finish_cpu(), as ran on the BP, will clean up the active_mm state */
 }
 
 /*





[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]

  Powered by Linux