The patch titled

     stop on cpu lost

has been removed from the -mm tree.  Its filename is

     stop-on-cpu-lost.patch

This patch was dropped because it was nacked by the maintainer

------------------------------------------------------
Subject: stop on cpu lost
From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@xxxxxxxxxxxxxx>

When an application is misconfigured at cpu hot removal, a task's
cpus_allowed can become empty.  This patch adds a sysctl to stop tasks
whose cpus_allowed is empty.

I think there is no single good answer to this problem; it depends on
system management policy.  On one system, forced migration is better than
stopping.  On another, stopping (and killing) tasks will meet the
requirement.

Now, when a task loses all of its allowed cpus because of cpu hot removal,
it is forced to migrate to a not-allowed cpu.  In this case, the task was
not properly reconfigured by the user before the cpu hot removal, so the
task (and the system) is in an unexpected, wrong state.  This migration is
perhaps one realistic workaround, but sometimes it is harmful (stealing
time on other cpus, triggering bugs in thread controllers, unexpected
execution...).

This patch adds the sysctl "sigstop_on_cpu_lost".  When
sigstop_on_cpu_lost==1, a task which loses its cpus will be stopped by
SIGSTOP.  Depending on system management policy, misconfigured applications
are stopped.
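The policy described above can be sketched as a small userspace model.  This
is only an illustration under stated assumptions, not the kernel code: cpu
sets are modelled as plain unsigned long bit masks instead of cpumask_t, and
the names cpu_bits, lost_cpu_action and on_cpu_removed() are hypothetical.
The kernel's "same node first" search for a destination cpu is left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model: one bit per cpu, not the kernel's cpumask_t. */
typedef unsigned long cpu_bits;

enum lost_cpu_action {
	ACTION_MOVE_WITHIN_ALLOWED,	/* an allowed cpu is still online */
	ACTION_FORCE_MIGRATE,		/* move to a not-allowed cpu */
	ACTION_MIGRATE_AND_STOP		/* move, then send SIGSTOP */
};

/*
 * Decide what happens to a task when dead_cpu goes away.
 * has_mm stands in for the patch's "tsk->mm" test, which skips
 * kernel threads.
 */
static enum lost_cpu_action
on_cpu_removed(cpu_bits allowed, cpu_bits online, int dead_cpu,
	       bool has_mm, int sigstop_on_cpu_lost)
{
	cpu_bits remaining;

	assert(dead_cpu >= 0 && dead_cpu < (int)(sizeof(cpu_bits) * 8));

	/* Cpus that stay online once dead_cpu is removed. */
	remaining = online & ~(1UL << dead_cpu);

	if (allowed & remaining)
		return ACTION_MOVE_WITHIN_ALLOWED;

	/* cpus_allowed is now effectively empty: apply the policy. */
	if (has_mm && sigstop_on_cpu_lost)
		return ACTION_MIGRATE_AND_STOP;

	return ACTION_FORCE_MIGRATE;
}
```

For example, a user task allowed only on cpu1 (allowed = 0x2) on a two-cpu
system (online = 0x3) is both migrated and stopped when cpu1 is removed with
sigstop_on_cpu_lost set, and only migrated when the sysctl is 0.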
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@xxxxxxxxxxxxxx>
Cc: Ashok Raj <ashok.raj@xxxxxxxxx>
Signed-off-by: Andrew Morton <akpm@xxxxxxxx>
---

 include/linux/sysctl.h |    1 +
 kernel/sched.c         |   14 ++++++++++++++
 kernel/sysctl.c        |   14 ++++++++++++++
 3 files changed, 29 insertions(+)

diff -puN include/linux/sysctl.h~stop-on-cpu-lost include/linux/sysctl.h
--- a/include/linux/sysctl.h~stop-on-cpu-lost
+++ a/include/linux/sysctl.h
@@ -151,6 +151,7 @@ enum
 	KERN_COMPAT_LOG=73,	/* int: print compat layer messages */
 	KERN_NMI_WATCHDOG=74, /* int: enable/disable nmi watchdog */
 	KERN_PANIC_ON_NMI=75, /* int: whether we will panic on an unrecovered */
+	KERN_STOP_ON_CPU_LOST=76, /* int: SIGSTOP when a task loses its cpus */
 };
 
 
diff -puN kernel/sched.c~stop-on-cpu-lost kernel/sched.c
--- a/kernel/sched.c~stop-on-cpu-lost
+++ a/kernel/sched.c
@@ -4570,11 +4570,13 @@ wait_to_die:
 }
 
 #ifdef CONFIG_HOTPLUG_CPU
+int sigstop_on_cpu_lost;
 /* Figure out where task on dead CPU should go, use force if neccessary. */
 static void move_task_off_dead_cpu(int dead_cpu, struct task_struct *tsk)
 {
 	int dest_cpu;
 	cpumask_t mask;
+	int force = 0;
 
 	/* On same node? */
 	mask = node_to_cpumask(cpu_to_node(dead_cpu));
@@ -4599,8 +4601,20 @@ static void move_task_off_dead_cpu(int d
 			printk(KERN_INFO "process %d (%s) no "
 			       "longer affine to cpu%d\n",
 			       tsk->pid, tsk->comm, dead_cpu);
+		/*
+		 * This thread was not properly reconfigured before cpu hot
+		 * remove. This means this process is in the wrong state now.
+		 * If system management policy doesn't allow misconfigured
+		 * applications, this process should be stopped.
+		 */
+		if (tsk->mm && sigstop_on_cpu_lost)
+			force = 1;
 	}
 	__migrate_task(tsk, dead_cpu, dest_cpu);
+
+	if (force) {
+		force_sig_specific(SIGSTOP, tsk);
+	}
 }
 
 /*
diff -puN kernel/sysctl.c~stop-on-cpu-lost kernel/sysctl.c
--- a/kernel/sysctl.c~stop-on-cpu-lost
+++ a/kernel/sysctl.c
@@ -130,6 +130,10 @@ extern int sysctl_hz_timer;
 extern int acct_parm[];
 #endif
 
+#ifdef CONFIG_HOTPLUG_CPU
+extern int sigstop_on_cpu_lost;
+#endif
+
 #ifdef CONFIG_IA64
 extern int no_unaligned_warning;
 #endif
@@ -705,6 +709,16 @@ static ctl_table kern_table[] = {
 		.proc_handler	= &proc_dointvec,
 	},
 #endif
+#ifdef CONFIG_HOTPLUG_CPU
+	{
+		.ctl_name	= KERN_STOP_ON_CPU_LOST,
+		.procname	= "sigstop_on_cpu_lost",
+		.data		= &sigstop_on_cpu_lost,
+		.maxlen		= sizeof(int),
+		.mode		= 0644,
+		.proc_handler	= &proc_dointvec,
+	},
+#endif
 	{ .ctl_name = 0 }
 };
 
_

Patches currently in -mm which might be from kamezawa.hiroyu@xxxxxxxxxxxxxx are

for_each_possible_cpu-xfs.patch
git-acpi.patch
acpi-memory-hotplug-cannot-manage-_crs-with-plural-resoureces.patch
zone-handle-unaligned-zone-boundaries.patch
wait_table-and-zonelist-initializing-for-memory-hotadd-change-name-of-wait_table_size.patch
wait_table-and-zonelist-initializing-for-memory-hotaddadd-return-code-for-init_current_empty_zone.patch
wait_table-and-zonelist-initializing-for-memory-hotadd-wait_table-initialization.patch
wait_table-and-zonelist-initializing-for-memory-hotadd-update-zonelists.patch
support-for-panic-at-oom.patch
pgdat-allocation-for-new-node-add-generic-alloc-node_data.patch
pgdat-allocation-for-new-node-add-refresh-node_data.patch
pgdat-allocation-for-new-node-add-export-kswapd-start-func.patch
pgdat-allocation-for-new-node-add-export-kswapd-start-func-fix.patch
pgdat-allocation-for-new-node-add-call-pgdat-allocation.patch
register-hot-added-memory-to-iomem-resource.patch
catch-valid-mem-range-at-onlining-memory.patch
node-hotplug-register-cpu-remove-node-struct.patch
node-hotplug-register-cpu-remove-node-struct-alpha-fix.patch
initialise-total_memory-earlier.patch
update-vm_total_pages-at-memory-hotadd.patch
page-migration-simplify-migrate_pages.patch
page-migration-handle-freeing-of-pages-in-migrate_pages.patch
page-migration-use-allocator-function-for-migrate_pages.patch
page-migration-support-moving-of-individual-pages.patch
remove-empty-node-at-boot-time.patch
stop-on-cpu-lost.patch
stop-on-cpu-lost-tidy.patch
namespaces-utsname-sysctl-hack-cleanup-2-fix.patch
reiser4-hardirq-include-fix.patch
genirq-rename-desc-handler-to-desc-chip-ia64-fix-2.patch

-
To unsubscribe from this list: send the line "unsubscribe mm-commits" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html