On Wed, Jan 05, 2011 at 05:00:43PM +0200, Vasileios Karakasis wrote:
> Peeking inside the mremap() source, I can see that the kernel already
> does this, i.e., mremap() preserves the policy of the original vm area.

That is true.

>
> The problem is when the user has not specified a binding for the
> original mapping (default policy), in which case copying explicitly the
> policy from the old to the new pages won't work either; the new pages
> will still have MPOL_DEFAULT. So realloc() cannot guarantee that the new

It would be possible to do

	get_mempolicy MPOL_F_ADDR
	if policy == MPOL_DEFAULT:
		get_mempolicy MPOL_F_NODE|MPOL_F_ADDR, &node
		mbind MPOL_PREFERRED, node

But then you end up with preferred instead of default.
It should usually be the same, but may not be in some corner
cases.

I guess you're right and that case is too obscure to care about.
I guess your original patch without anything was good enough.
It may be worth adding some comments on this rationale though.

> pages will be allocated on the same node as the preceding alloc(),
> unless there is a way to obtain the actual node that the pages of the
> original allocation were allocated on. In my opinion, this isn't a real
> problem, because even the simple numa_alloc() using the default policy,
> cannot guarantee that the pages will be allocated on the node of the
> calling cpu: what if the task is migrated to a different cpu on a
> different node, while touching (i.e., allocating) the pages with the
> police_memory_int()?

Process policy and MPOL_DEFAULT are always just heuristics; such races
can always occur. They usually should not, because the scheduler
does not migrate too frequently.

-Andi
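
For reference, the get_mempolicy/mbind sequence outlined in the reply could
look roughly like the C sketch below. This is not code from the patch under
discussion. The helper name prefer_current_node(), its return convention and
the 1024-node mask size are assumptions made for illustration, and the raw
syscall() interface is used only so that the sketch does not depend on
libnuma. It assumes addr is page-aligned and that the first page has already
been touched.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>        /* mmap(), for callers that map the region */
#include <sys/syscall.h>     /* SYS_get_mempolicy, SYS_mbind */
#include <unistd.h>          /* syscall() */
#include <linux/mempolicy.h> /* MPOL_DEFAULT, MPOL_PREFERRED, MPOL_F_* */

/* Assumed upper bound for this sketch: 16 words = 1024 nodes on 64-bit. */
#define MASK_WORDS    16
#define BITS_PER_WORD (8 * sizeof(unsigned long))

/*
 * Hypothetical helper. If the mapping at addr has no explicit policy,
 * bind [addr, addr+len) with MPOL_PREFERRED to the node that currently
 * backs the page at addr.
 *
 * Returns 1 if the range was switched to MPOL_PREFERRED, 0 if an
 * explicit policy was already set (nothing done), -1 on error with
 * errno set (for example ENOSYS on a kernel without NUMA support).
 */
static int prefer_current_node(void *addr, size_t len)
{
    int policy = -1;
    int node = -1;
    unsigned long mask[MASK_WORDS] = { 0 };

    /* Step 1: policy governing the address (MPOL_F_ADDR). */
    if (syscall(SYS_get_mempolicy, &policy, NULL, 0UL, addr,
                (unsigned long)MPOL_F_ADDR) != 0)
        return -1;

    /* An explicit policy is preserved by mremap(); leave it alone. */
    if (policy != MPOL_DEFAULT)
        return 0;

    /* Step 2: with MPOL_F_NODE|MPOL_F_ADDR the first argument receives
     * the ID of the node on which the page at addr is allocated. */
    if (syscall(SYS_get_mempolicy, &node, NULL, 0UL, addr,
                (unsigned long)(MPOL_F_NODE | MPOL_F_ADDR)) != 0)
        return -1;

    if (node < 0 || (size_t)node >= MASK_WORDS * BITS_PER_WORD) {
        errno = ERANGE;
        return -1;
    }

    /* Single-bit node mask for the node found above. */
    mask[(size_t)node / BITS_PER_WORD] |= 1UL << ((size_t)node % BITS_PER_WORD);

    /* Step 3: prefer that node for the whole range. maxnode is passed
     * as the mask size plus one, which is an assumption about how the
     * kernel counts bits; libnuma does the same as far as I know. */
    if (syscall(SYS_mbind, addr, (unsigned long)len,
                (unsigned long)MPOL_PREFERRED, mask,
                (unsigned long)(MASK_WORDS * BITS_PER_WORD + 1), 0UL) != 0)
        return -1;

    return 1;
}
```

A caller would invoke prefer_current_node(old, old_len) on the original
mapping before mremap(), so that the preserved policy steers the new pages
to the same node. As noted in the reply, the range then carries
MPOL_PREFERRED rather than MPOL_DEFAULT, which is usually equivalent but may
differ in corner cases.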