+ stop-on-cpu-lost.patch added to -mm tree

The patch titled

     stop on cpu lost

has been added to the -mm tree.  Its filename is

     stop-on-cpu-lost.patch

See http://www.zip.com.au/~akpm/linux/patches/stuff/added-to-mm.txt to find
out what to do about this

------------------------------------------------------
Subject: stop on cpu lost
From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@xxxxxxxxxxxxxx>


If an application is misconfigured when a cpu is hot-removed, a task's
cpus_allowed mask can become empty.  This patch adds a sysctl to stop tasks
whose cpus_allowed is empty.

I think there is no single good answer to this problem; it depends on
system management policy.  On one system, forced migration is better than
stopping the task.  On another, stopping (and then killing) the task will
meet the requirement.

Currently, when a task loses all of its allowed cpus because of cpu hot
removal, it is forced to migrate to a cpu outside its allowed set.

In this case, the task was not properly reconfigured by the user before
the cpu hot removal, so the task (and the system) is in an unexpected,
wrong state.  Forced migration is one realistic workaround, but it can
sometimes be harmful (stealing other cpus' time, triggering bugs in
thread controllers, causing unexpected execution, ...).

This patch adds the sysctl "sigstop_on_cpu_lost".  When
sigstop_on_cpu_lost==1, a user task which loses all of its cpus is still
migrated, but is then stopped by SIGSTOP.  Depending on system management
policy, misconfigured applications can thus be stopped.
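
Since the entry is added to kern_table, the knob would presumably appear
as /proc/sys/kernel/sigstop_on_cpu_lost; that path is inferred from the
procname and is not stated elsewhere in this mail.  A minimal user-space
sketch for setting it (the helper name write_int_knob() is hypothetical):

```c
#include <assert.h>
#include <stdio.h>

/* Path inferred from the "sigstop_on_cpu_lost" procname in kern_table. */
#define SIGSTOP_KNOB "/proc/sys/kernel/sigstop_on_cpu_lost"

/*
 * Write an integer to a sysctl-style file.
 * Returns 0 on success, -1 if the file cannot be opened or written.
 */
static int write_int_knob(const char *path, int value)
{
	FILE *f;
	int ok;

	assert(path != NULL);
	f = fopen(path, "w");
	if (f == NULL)
		return -1;
	ok = fprintf(f, "%d\n", value) > 0;
	/* fclose() flushes the buffer, so a rejected write shows up here. */
	if (fclose(f) != 0)
		ok = 0;
	return ok ? 0 : -1;
}
```

Calling write_int_knob(SIGSTOP_KNOB, 1) as root would enable the stop
behaviour; writing 0 would leave only the forced migration.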

Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@xxxxxxxxxxxxxx>
Cc: Ashok Raj <ashok.raj@xxxxxxxxx>
Signed-off-by: Andrew Morton <akpm@xxxxxxxx>
---

 include/linux/sysctl.h |    1 +
 kernel/sched.c         |   14 ++++++++++++++
 kernel/sysctl.c        |   14 ++++++++++++++
 3 files changed, 29 insertions(+)

diff -puN include/linux/sysctl.h~stop-on-cpu-lost include/linux/sysctl.h
--- a/include/linux/sysctl.h~stop-on-cpu-lost
+++ a/include/linux/sysctl.h
@@ -151,6 +151,7 @@ enum
 	KERN_COMPAT_LOG=73,	/* int: print compat layer  messages */
 	KERN_NMI_WATCHDOG=74, /* int: enable/disable nmi watchdog */
 	KERN_PANIC_ON_NMI=75, /* int: whether we will panic on an unrecovered */
+	KERN_STOP_ON_CPU_LOST=76, /* int: SIGSTOP when a task loses its cpus */
 };
 
 
diff -puN kernel/sched.c~stop-on-cpu-lost kernel/sched.c
--- a/kernel/sched.c~stop-on-cpu-lost
+++ a/kernel/sched.c
@@ -4570,11 +4570,13 @@ wait_to_die:
 }
 
 #ifdef CONFIG_HOTPLUG_CPU
+int sigstop_on_cpu_lost;
 /* Figure out where task on dead CPU should go, use force if neccessary. */
 static void move_task_off_dead_cpu(int dead_cpu, struct task_struct *tsk)
 {
 	int dest_cpu;
 	cpumask_t mask;
+	int force = 0;
 
 	/* On same node? */
 	mask = node_to_cpumask(cpu_to_node(dead_cpu));
@@ -4599,8 +4601,20 @@ static void move_task_off_dead_cpu(int d
 			printk(KERN_INFO "process %d (%s) no "
 			       "longer affine to cpu%d\n",
 			       tsk->pid, tsk->comm, dead_cpu);
+		/*
+		 * This thread was not properly reconfigured before cpu hot
+		 * remove. This means this process is in the wrong state now.
+		 * If system management policy doesn't allow misconfigured
+		 * applications, this process should be stopped.
+		 */
+		if (tsk->mm && sigstop_on_cpu_lost)
+			force = 1;
 	}
 	__migrate_task(tsk, dead_cpu, dest_cpu);
+
+	if (force) {
+		force_sig_specific(SIGSTOP, tsk);
+	}
 }
 
 /*
diff -puN kernel/sysctl.c~stop-on-cpu-lost kernel/sysctl.c
--- a/kernel/sysctl.c~stop-on-cpu-lost
+++ a/kernel/sysctl.c
@@ -130,6 +130,10 @@ extern int sysctl_hz_timer;
 extern int acct_parm[];
 #endif
 
+#ifdef CONFIG_HOTPLUG_CPU
+extern int sigstop_on_cpu_lost;
+#endif
+
 #ifdef CONFIG_IA64
 extern int no_unaligned_warning;
 #endif
@@ -705,6 +709,16 @@ static ctl_table kern_table[] = {
 		.proc_handler	= &proc_dointvec,
 	},
 #endif
+#ifdef CONFIG_HOTPLUG_CPU
+	{
+		.ctl_name	= KERN_STOP_ON_CPU_LOST,
+		.procname	= "sigstop_on_cpu_lost",
+		.data		= &sigstop_on_cpu_lost,
+		.maxlen		= sizeof(int),
+		.mode		= 0644,
+		.proc_handler	= &proc_dointvec,
+	},
+#endif
 	{ .ctl_name = 0 }
 };
 
_

Patches currently in -mm which might be from kamezawa.hiroyu@xxxxxxxxxxxxxx are

for_each_possible_cpu-xfs.patch
git-acpi.patch
acpi-memory-hotplug-cannot-manage-_crs-with-plural-resoureces.patch
zone-handle-unaligned-zone-boundaries.patch
wait_table-and-zonelist-initializing-for-memory-hotadd-change-name-of-wait_table_size.patch
wait_table-and-zonelist-initializing-for-memory-hotaddadd-return-code-for-init_current_empty_zone.patch
wait_table-and-zonelist-initializing-for-memory-hotadd-wait_table-initialization.patch
wait_table-and-zonelist-initializing-for-memory-hotadd-update-zonelists.patch
support-for-panic-at-oom.patch
pgdat-allocation-for-new-node-add-generic-alloc-node_data.patch
pgdat-allocation-for-new-node-add-refresh-node_data.patch
pgdat-allocation-for-new-node-add-export-kswapd-start-func.patch
pgdat-allocation-for-new-node-add-export-kswapd-start-func-fix.patch
pgdat-allocation-for-new-node-add-call-pgdat-allocation.patch
register-hot-added-memory-to-iomem-resource.patch
catch-valid-mem-range-at-onlining-memory.patch
node-hotplug-register-cpu-remove-node-struct.patch
node-hotplug-register-cpu-remove-node-struct-alpha-fix.patch
initialise-total_memory-earlier.patch
update-vm_total_pages-at-memory-hotadd.patch
page-migration-simplify-migrate_pages.patch
page-migration-simplify-migrate_pages-tweaks.patch
page-migration-handle-freeing-of-pages-in-migrate_pages.patch
page-migration-use-allocator-function-for-migrate_pages.patch
page-migration-support-moving-of-individual-pages.patch
page-migration-detailed-status-for-moving-of-individual-pages.patch
remove-empty-node-at-boot-time.patch
stop-on-cpu-lost.patch
namespaces-utsname-sysctl-hack-cleanup-2-fix.patch
reiser4-hardirq-include-fix.patch
genirq-rename-desc-handler-to-desc-chip-ia64-fix-2.patch

