At present, top_cpuset.mems_allowed is the same as node_states[N_MEMORY]
and cannot be changed at runtime. The maximum possible
node_states[N_MEMORY] is also reflected in the top_cpuset.effective_mems
interface. This prevents anyone from using the cpuset mechanism to remove
or restrict memory placement on a given memory node system wide, which
might be limiting.

Solve the problem by enabling update_nodemask() to accept changes to
top_cpuset.mems_allowed as well. Once changed, it also updates the value
of top_cpuset.effective_mems and the mems_allowed nodemask of all its
tasks. It calls cpuset_inc() to make sure the cpuset is accounted for in
the buddy allocator through the cpusets_enabled() check.

Signed-off-by: Anshuman Khandual <khandual@xxxxxxxxxxxxxxxxxx>
---
Tested for

* Enforcement of changed top_cpuset.mems_allowed
* Global mems_allowed cannot be changed while there are other cpusets
  present underneath the top root cpuset. I guess it is expected.

 kernel/cpuset.c | 21 +++++++++++----------
 1 file changed, 11 insertions(+), 10 deletions(-)

diff --git a/kernel/cpuset.c b/kernel/cpuset.c
index b308888..e8c105a 100644
--- a/kernel/cpuset.c
+++ b/kernel/cpuset.c
@@ -1210,15 +1210,6 @@ static int update_nodemask(struct cpuset *cs, struct cpuset *trialcs,
 	int retval;
 
 	/*
-	 * top_cpuset.mems_allowed tracks node_stats[N_MEMORY];
-	 * it's read-only
-	 */
-	if (cs == &top_cpuset) {
-		retval = -EACCES;
-		goto done;
-	}
-
-	/*
 	 * An empty mems_allowed is ok iff there are no tasks in the cpuset.
 	 * Since nodelist_parse() fails on an empty mask, we special case
 	 * that parsing.  The validate_change() call ensures that cpusets
@@ -1232,7 +1223,7 @@ static int update_nodemask(struct cpuset *cs, struct cpuset *trialcs,
 			goto done;
 
 		if (!nodes_subset(trialcs->mems_allowed,
-				  top_cpuset.mems_allowed)) {
+				  node_states[N_MEMORY])) {
 			retval = -EINVAL;
 			goto done;
 		}
@@ -1250,6 +1241,16 @@ static int update_nodemask(struct cpuset *cs, struct cpuset *trialcs,
 	cs->mems_allowed = trialcs->mems_allowed;
 	spin_unlock_irq(&callback_lock);
 
+	if (cs == &top_cpuset) {
+		spin_lock_irq(&callback_lock);
+		cs->effective_mems = trialcs->mems_allowed;
+		spin_unlock_irq(&callback_lock);
+
+		update_tasks_nodemask(cs);
+		cpuset_inc();
+		goto done;
+	}
+
 	/* use trialcs->mems_allowed as a temp variable */
 	update_nodemasks_hier(cs, &trialcs->mems_allowed);
 done:
-- 
1.8.3.1
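
For readers who want to reason about the new control flow outside the
kernel, here is a minimal userspace sketch of what update_nodemask() does
after this change. It is not kernel code. The names nodemask, cpuset_model,
mask_subset() and model_update_nodemask() are hypothetical stand-ins, a
uint64_t bitmask replaces nodemask_t, the n_memory argument plays the role
of node_states[N_MEMORY], and locking, validate_change(), task updates and
cpuset_inc() are deliberately left out. The non-top branch is only a rough
approximation of update_nodemasks_hier().

```c
/* Userspace sketch only; every name below is a hypothetical stand-in. */
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t nodemask;	/* one bit per memory node (assumption) */

struct cpuset_model {
	struct cpuset_model *parent;	/* NULL marks the top cpuset */
	nodemask mems_allowed;
	nodemask effective_mems;
};

/* Models nodes_subset(): true when every bit of a is also set in b. */
static bool mask_subset(nodemask a, nodemask b)
{
	return (a & ~b) == 0;
}

/*
 * Models the patched flow: the request is checked against the nodes that
 * have memory (n_memory), not against top_cpuset.mems_allowed, and the
 * top cpuset is no longer rejected up front.
 */
static int model_update_nodemask(struct cpuset_model *cs, nodemask requested,
				 nodemask n_memory)
{
	if (!mask_subset(requested, n_memory))
		return -EINVAL;

	cs->mems_allowed = requested;

	if (cs->parent == NULL) {
		/* Top cpuset: effective_mems follows mems_allowed directly. */
		cs->effective_mems = requested;
		return 0;
	}

	/* Simplified stand-in for the hierarchical update of a child. */
	cs->effective_mems = requested & cs->parent->effective_mems;
	return 0;
}
```

In this model, shrinking the top cpuset from nodes 0-3 to nodes 0-1
succeeds and narrows effective_mems, while a request naming a node
outside n_memory fails with -EINVAL and leaves the masks untouched.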