+ cpuset-fix-obscure-attach_task-vs-exiting-race.patch added to -mm tree

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



The patch titled

     cpuset: fix obscure attach_task vs exiting race

has been added to the -mm tree.  Its filename is

     cpuset-fix-obscure-attach_task-vs-exiting-race.patch

See http://www.zip.com.au/~akpm/linux/patches/stuff/added-to-mm.txt to find
out what to do about this

------------------------------------------------------
Subject: cpuset: fix obscure attach_task vs exiting race
From: Paul Jackson <pj@xxxxxxx>

Fix obscure race condition in kernel/cpuset.c attach_task() code.

There is basically zero chance of anyone accidentally being harmed by this
race.

It requires a special 'micro-stress' load and a special timing loop hacks
in the kernel to hit in less than an hour, and even then you'd have to hit
it hundreds or thousands of times, followed by some unusual and senseless
cpuset configuration requests, including removing the top cpuset, to cause
any visibly harm affects.

One could, with perhaps a few days or weeks of such effort, get the
reference count on the top cpuset below zero, and manage to crash the
kernel by asking to remove the top cpuset.

I found it by code inspection.

The race was introduced when 'the_top_cpuset_hack' was introduced, and one
piece of code was not updated.  An old check for a possibly null task
cpuset pointer needed to be changed to a check for a task marked
PF_EXITING.  The pointer can't be null anymore, thanks to
the_top_cpuset_hack (documented in kernel/cpuset.c).  But the task could
have gone into PF_EXITING state after it was found in the task_list scan.

If a task is PF_EXITING in this code, it is possible that its task->cpuset
pointer is pointing to the top cpuset due to the_top_cpuset_hack, rather
than because the top_cpuset was that tasks last valid cpuset.  In that
case, the wrong cpuset reference counter would be decremented.

The fix is trivial.  Instead of failing the system call if the tasks cpuset
pointer is null here, fail it if the task is in PF_EXITING state.

The code for 'the_top_cpuset_hack' that changes an exiting tasks cpuset to
the top_cpuset is done without locking, so could happen at anytime.  But it
is done during the exit handling, after the PF_EXITING flag is set.  So if
we verify that a task is still not PF_EXITING after we copy out its cpuset
pointer (into 'oldcs', below), we know that 'oldcs' is not one of these
hack references to the top_cpuset.

Signed-off-by: Paul Jackson <pj@xxxxxxx>
Signed-off-by: Andrew Morton <akpm@xxxxxxxx>
---

 kernel/cpuset.c |    7 ++++++-
 1 files changed, 6 insertions(+), 1 deletion(-)

diff -puN kernel/cpuset.c~cpuset-fix-obscure-attach_task-vs-exiting-race kernel/cpuset.c
--- a/kernel/cpuset.c~cpuset-fix-obscure-attach_task-vs-exiting-race
+++ a/kernel/cpuset.c
@@ -1225,7 +1225,12 @@ static int attach_task(struct cpuset *cs
 
 	task_lock(tsk);
 	oldcs = tsk->cpuset;
-	if (!oldcs) {
+	/*
+	 * After getting 'oldcs' cpuset ptr, be sure still not exiting.
+	 * If 'oldcs' might be the top_cpuset due to the_top_cpuset_hack
+	 * then fail this attach_task(), to avoid breaking top_cpuset.count.
+	 */
+	if (tsk->flags & PF_EXITING) {
 		task_unlock(tsk);
 		mutex_unlock(&callback_mutex);
 		put_task_struct(tsk);
_

Patches currently in -mm which might be from pj@xxxxxxx are

apply-type-enum-zone_type-fix.patch
cpu-hotplug-compatible-alloc_percpu-fix.patch
cpu-hotplug-compatible-alloc_percpu-fix-2.patch
oom-cpuset-hint.patch
account-for-memmap-and-optionally-the-kernel-image-as-holes-fix.patch
cpuset-top_cpuset-tracks-hotplug-changes-to-node_online_map.patch
cpuset-top_cpuset-tracks-hotplug-changes-to-node_online_map-fix.patch
cpuset-top_cpuset-tracks-hotplug-changes-to-node_online_map-fix-2.patch
cpuset-hotunplug-cpus-and-mems-in-all-cpusets.patch
cpuset-fix-obscure-attach_task-vs-exiting-race.patch
sched-force-sbin-init-off-isolated-cpus.patch
sched-generic-sched_group-cpu-power-setup.patch

-
To unsubscribe from this list: send the line "unsubscribe mm-commits" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[Index of Archives]     [Kernel Newbies FAQ]     [Kernel Archive]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [Bugtraq]     [Photo]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]

  Powered by Linux