On Fri, 02 Nov 2012 15:41:55 +0800
Wen Congyang <wency@xxxxxxxxxxxxxx> wrote:

> At 11/02/2012 05:36 AM, David Rientjes Wrote:
> > On Thu, 1 Nov 2012, Wen Congyang wrote:
> >
> >>> This doesn't describe why we need the new node state, unfortunately.  It
> >>
> >> 1. Sometimes, we use the node which contains the memory that can be used by
> >>    kernel.
> >> 2. Sometimes, we use the node which contains the memory.
> >>
> >> In case1, we use N_HIGH_MEMORY, and we use N_MEMORY in case2.
> >>
> >
> > Yeah, that's clear, but the question is still _why_ we want two different
> > nodemasks.  I know that this part of the patchset simply introduces the
> > new nodemask because the name "N_MEMORY" is more clear than
> > "N_HIGH_MEMORY", but there's no real incentive for making that change by
> > introducing a new nodemask where a simple rename would suffice.
> >
> > I can only assume that you want to later use one of them for a different
> > purpose: those that do not include nodes that consist of only
> > ZONE_MOVABLE.  But that change for MPOL_BIND is nacked since it
> > significantly changes the semantics of set_mempolicy() and you can't break
> > userspace (see my response to that from yesterday).  Until that problem is
> > addressed, then there's no reason for the additional nodemask so nack on
> > this series as well.

I cannot locate "my response to that from yesterday".  Specificity, please!

>
> I still think that we need two nodemasks: one that stores the nodes which have
> memory that the kernel can use, and one that stores the nodes which have memory.
>
> For example:
>
> ==========================
> static void *__meminit alloc_page_cgroup(size_t size, int nid)
> {
> 	gfp_t flags = GFP_KERNEL | __GFP_ZERO | __GFP_NOWARN;
> 	void *addr = NULL;
>
> 	addr = alloc_pages_exact_nid(nid, size, flags);
> 	if (addr) {
> 		kmemleak_alloc(addr, size, 1, flags);
> 		return addr;
> 	}
>
> 	if (node_state(nid, N_HIGH_MEMORY))
> 		addr = vzalloc_node(size, nid);
> 	else
> 		addr = vzalloc(size);
>
> 	return addr;
> }
> ==========================
> If the node only has ZONE_MOVABLE memory, we should use vzalloc().
> So we should have a mask that stores the nodes which have memory that
> the kernel can use.
>
> ==========================
> static int mpol_set_nodemask(struct mempolicy *pol,
> 		     const nodemask_t *nodes, struct nodemask_scratch *nsc)
> {
> 	int ret;
>
> 	/* if mode is MPOL_DEFAULT, pol is NULL. This is right. */
> 	if (pol == NULL)
> 		return 0;
> 	/* Check N_HIGH_MEMORY */
> 	nodes_and(nsc->mask1,
> 		  cpuset_current_mems_allowed, node_states[N_HIGH_MEMORY]);
> ...
> 		if (pol->flags & MPOL_F_RELATIVE_NODES)
> 			mpol_relative_nodemask(&nsc->mask2, nodes,&nsc->mask1);
> 		else
> 			nodes_and(nsc->mask2, *nodes, nsc->mask1);
> ...
> }
> ==========================
> If the user specifies 2 nodes, one that has only ZONE_MOVABLE memory and one
> that doesn't, nsc->mask2 should contain both nodes. So we should have a mask
> that stores the nodes which have memory.
>
> There may be something wrong in the change for MPOL_BIND. But this patchset is needed.

Well, let's discuss the userspace-visible non-back-compatible mpol change.
What is it, why did it happen, what is its impact, is it acceptable?

I grabbed "PART1" and "PART2", but that's as far as I got with the six
memory hotplug patch series.
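
For readers following the thread, the two-mask argument above can be modelled outside the kernel. The following is only a minimal userspace sketch, not kernel code and not part of the patchset: every toy_* name is hypothetical, a node mask is reduced to a 64-bit word, and the state semantics follow Wen's description (N_HIGH_MEMORY for nodes with kernel-usable memory, N_MEMORY for nodes with any memory).

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy model: one bit per node, at most 64 nodes. Hypothetical names. */
typedef uint64_t toy_nodemask_t;

enum toy_node_state {
	TOY_N_HIGH_MEMORY,	/* node has memory the kernel itself can use */
	TOY_N_MEMORY,		/* node has any memory, even ZONE_MOVABLE only */
	TOY_NR_NODE_STATES
};

toy_nodemask_t toy_node_states[TOY_NR_NODE_STATES];

/* Record a node; kernel_usable is false for a ZONE_MOVABLE-only node. */
void toy_node_online(int nid, bool has_memory, bool kernel_usable)
{
	if (nid < 0 || nid >= 64 || !has_memory)
		return;
	toy_node_states[TOY_N_MEMORY] |= UINT64_C(1) << nid;
	if (kernel_usable)
		toy_node_states[TOY_N_HIGH_MEMORY] |= UINT64_C(1) << nid;
}

bool toy_node_state(int nid, enum toy_node_state state)
{
	if (nid < 0 || nid >= 64)
		return false;
	return (toy_node_states[state] >> nid) & 1;
}

/*
 * Case 1 (alloc_page_cgroup): a node-local kernel allocation only makes
 * sense if the node has kernel-usable memory; otherwise fall back to an
 * allocation with no node preference.
 */
bool toy_can_alloc_kernel_on(int nid)
{
	return toy_node_state(nid, TOY_N_HIGH_MEMORY);
}

/*
 * Case 2 (mpol_set_nodemask): user pages may live in ZONE_MOVABLE, so the
 * requested nodes are filtered against every node that has memory.
 */
toy_nodemask_t toy_policy_nodes(toy_nodemask_t requested,
				toy_nodemask_t cpuset_allowed)
{
	return requested & cpuset_allowed & toy_node_states[TOY_N_MEMORY];
}
```

With node 0 holding normal memory and node 1 holding only movable memory, toy_can_alloc_kernel_on(1) is false while toy_policy_nodes() still keeps node 1 in the policy mask. Whether that second behaviour is acceptable to userspace is exactly the open question in this thread.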