On 11.4.2017 19:32, Christoph Lameter wrote:
> On Tue, 11 Apr 2017, Vlastimil Babka wrote:
>
>> The task->il_next variable remembers the last allocation node for task's
>> MPOL_INTERLEAVE policy. mpol_rebind_nodemask() updates interleave and
>> bind mempolicies due to changing cpuset mems. Currently it also tries to
>> make sure that current->il_next is valid within the updated nodemask. This is
>> bogus, because 1) we are updating potentially any task's mempolicy, not just
>> current, and 2) we might be updating per-vma mempolicy, not task one.
>>
>> The interleave_nodes() function that uses il_next can cope fine with the value
>> not being within the currently allowed nodes, so this hasn't manifested as an
>> actual issue. Thus it also won't be an issue if we just remove this adjustment
>> completely.
>
> Well, interleave_nodes() will then potentially return a node outside of
> the allowed memory policy when it's called for the first time after
> mpol_rebind_.. . But then it will find the next node within the
> nodemask and work correctly for the next invocations.

Hmm, you're right. But that could be easily fixed if il_next became il_prev, so
we would return the result of next_node_in(il_prev) and also store it as the new
il_prev, right? I somehow assumed it already worked that way.

> But yea the race can probably be ignored. The idea was that the
> application has a stable memory footprint during rebinding.
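To make the il_next vs. il_prev difference concrete, here is a rough userspace sketch, not kernel code. Everything in it is an assumption made for illustration: the nodemask is a plain unsigned bitmask, sketch_next_node_in() only imitates the wrap-around behaviour of next_node_in(), and struct sketch_task stands in for the relevant task_struct field.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical stand-in for nodemask_t: bit n set means node n is allowed. */
typedef unsigned int nodemask_sketch_t;
#define SKETCH_MAX_NODES ((int)(sizeof(nodemask_sketch_t) * CHAR_BIT))

/* Hypothetical stand-in for the per-task interleave state. */
struct sketch_task {
	int il_next;	/* current scheme: node to hand out on the next call */
	int il_prev;	/* proposed scheme: node handed out on the last call */
};

/*
 * Imitates next_node_in(): first allowed node strictly after 'node',
 * wrapping around to the lowest allowed node. Returns -1 for an empty mask.
 */
static int sketch_next_node_in(int node, nodemask_sketch_t mask)
{
	assert(node >= -1 && node < SKETCH_MAX_NODES);
	for (int i = 1; i <= SKETCH_MAX_NODES; i++) {
		int cand = (node + i) % SKETCH_MAX_NODES;

		if (mask & (1u << cand))
			return cand;
	}
	return -1;
}

/* il_next scheme: return the stored node, then advance it within 'allowed'. */
static int interleave_il_next(struct sketch_task *t, nodemask_sketch_t allowed)
{
	int nid = t->il_next;	/* may be stale after a rebind */
	int next = sketch_next_node_in(nid, allowed);

	if (next >= 0)
		t->il_next = next;
	return nid;
}

/* il_prev scheme: advance first, store the result, and return it. */
static int interleave_il_prev(struct sketch_task *t, nodemask_sketch_t allowed)
{
	int next = sketch_next_node_in(t->il_prev, allowed);

	if (next >= 0)
		t->il_prev = next;
	return next;	/* always within 'allowed' unless the mask is empty */
}
```

With a stored value of 3 and the allowed mask rebound to nodes {0,1} (0x3), the il_next variant hands out node 3 once before settling into 0, 1, ..., while the il_prev variant hands out 0 straight away. That is only what this model does; whether the real change is this simple would need checking against the actual interleave_nodes() callers.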