Hi Lee,

On Thu, Aug 7, 2008 at 5:56 PM, Lee Schermerhorn <Lee.Schermerhorn@xxxxxx> wrote:
> On Thu, 2008-08-07 at 15:34 +0200, Michael Kerrisk wrote:
>> Lee,
>>
>> (Your patch numbering got a bit wonky. There were two 4's and two 6's, but
>> no 5 or 7.)
>>
>> Depending what you think about the comments below, it may be worth revising
>> this patch and resending.
>>
>> Lee Schermerhorn wrote:
>> > Another attempt to rationalize description of MPOL_DEFAULT.
>> >
>> > Since ~2.6.25, the system default memory policy is "local allocation".
>>
>> So -- kernel behavior changed at 2.6.25? If so, do we not need to include
>> some statement about before and after behavior?
>
> Actually, the behavior did not change. The implementation did. As
> older man pages noted, MPOL_DEFAULT seemed to have a double meaning.
> For task/process policy, it meant local allocation; for memory region
> policy, installed by mbind(), it meant "fall back to task/process"
> policy. Really, in both cases, it was just removing any policy for the
> specified "scope" [process or memory range] so that the policy would
> fall back to the "surrounding scope" [system or process].
>
> Now, the only mempolicy in the kernel where MPOL_DEFAULT actually showed
> up in the policy field was the system default policy. As a result, we
> had to have a special case for MPOL_DEFAULT where ever we act on the
> policy. But, we have another way to specify "local" allocation, both
> internally and via the APIs--using MPOL_PREFERRED. We changed the
> system default policy to use the "explicit local" policy and removed all
> of the special case for MPOL_DEFAULT. So, now MPOL_DEFAULT exist only
> as an API mode value that means "delete the mempolicy, if any, in the
> specified scope so that it falls back to the surrounding scope".
>
> I explained this in the Documentation/vm/numa_memory_policy.txt doc. I
> didn't think it was worth explaining mempolicy scope in the man pages.

Okay -- I applied the patch for 3.07. Thanks!
Michael

>> > MPOL_DEFAULT itself is a request to remove any non-default policy and
>> > "fall back" to the surrounding context. Try to say that without delving
>> > into implementation details.
>> >
>> > Signed-off-by: Lee Schermerhorn <lee.schermerhorn@xxxxxx>
>> >
>> > man2/set_mempolicy.2 | 19 ++++++++++---------
>> > 1 files changed, 10 insertions(+), 9 deletions(-)
>> >
>> > Index: man-pages-3.05/man2/set_mempolicy.2
>> > ===================================================================
>> > --- man-pages-3.05.orig/man2/set_mempolicy.2 2008-07-29 16:49:36.000000000 -0400
>> > +++ man-pages-3.05/man2/set_mempolicy.2 2008-07-29 16:50:22.000000000 -0400
>> > @@ -99,15 +99,15 @@ A non-empty
>> > specifies physical node ids.
>> > Linux does will not remap the
>> > .I nodemask
>> > -when the task moves to a different cpuset context,
>> > -nor when the set of nodes allowed by the task's
>> > +when the process moves to a different cpuset context,
>> > +nor when the set of nodes allowed by the process'
>> > current cpuset context changes.
>> > .TP
>> > .B MPOL_F_RELATIVE_NODES
>> > A non-empty
>> > .I nodemask
>> > specifies node ids that are relative to the set of
>> > -node ids allowed by the task's current cpuset.
>> > +node ids allowed by the process' current cpuset.
>> > .PP
>> > .I nodemask
>> > points to a bit mask of node IDs that contains up to
>> > @@ -132,7 +132,7 @@ argument is ignored.
>> > Where a
>> > .I nodemask
>> > is required, it must contain at least one node that is on-line,
>> > -allowed by the task's current cpuset context
>> > +allowed by the process' current cpuset context
>> > [unless the
>> > .B MPOL_F_STATIC_NODES
>> > mode flag is specified],
>> > @@ -152,8 +152,10 @@ cpuset context includes one or more of t
>> >
>> > The
>> > .B MPOL_DEFAULT
>> > -mode is the default and means to allocate memory locally,
>> > -i.e., on the node of the CPU that triggered the allocation.
>> > +mode specifies that any non-default process memory policy be removed
>> > +and "fall back" to the system default policy.
>> > +The system default policy is "local allocation"--
>> > +i.e., allocate memory on the node of the CPU that triggered the allocation.
>> > .I nodemask
>> > must be specified as NULL.
>> > If the "local node" contains no free memory, the system will
>> > @@ -203,9 +205,8 @@ If the
>> > .I nodemask
>> > and
>> > .I maxnode
>> > -arguments specify the empty set, then the memory is allocated on
>> > -the node of the CPU that triggered the allocation (like
>> > -.BR MPOL_DEFAULT ).
>> > +arguments specify the empty set, then the policy specifies
>> > +explicit local allocation.
>>
>> Was there a reason to lose the words " allocated on the node of the CPU that
>> triggered the allocation"? Are they no longer true? If they are still true,
>> it seems helpful to keep them (or something similar), since "explicit local
>> allocation" seems a little harder to understand.
>
> Well, I had defined [sort of] "local allocation" further up when
> discussing the behavior of MPOL_DEFAULT and the system default
> mempolicy. So, I thought it was redundant to keep that long phrase
> here. You could just drop the "explicit" and maybe add quotes around
> "local allocation" and even add "like the system default policy
> discussed above". Whatever you think works is fine with me, tho'.
>
>>
>> Cheers,
>>
>> Michael
>>
>> > The process memory policy is preserved across an
>> > .BR execve (2),
>> >
>> >
>

--
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
man-pages online: http://www.kernel.org/doc/man-pages/online_pages.html
Found a bug? http://www.kernel.org/doc/man-pages/reporting_bugs.html
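
For anyone reading this thread later, the scope fallback Lee describes can be sketched as a small toy model in C. This is only an illustration of the semantics discussed above, not kernel code and not the real set_mempolicy(2)/mbind(2) API: every toy_* name is hypothetical, and the model assumes just two scopes (one process, one memory range) on top of a system default.

```c
#include <assert.h>
#include <stddef.h>

/* Toy mode values. The names mirror the man page modes, but the
 * numeric values and types here are made up for illustration. */
enum toy_mode {
    TOY_MPOL_DEFAULT,
    TOY_MPOL_PREFERRED,
    TOY_MPOL_BIND,
    TOY_MPOL_INTERLEAVE
};

struct toy_policy {
    enum toy_mode mode;
    unsigned long nodemask;   /* 0 = empty set */
};

/* System default: "explicit local" allocation, modelled here as
 * PREFERRED with an empty nodemask. */
static const struct toy_policy toy_system_default = { TOY_MPOL_PREFERRED, 0UL };

struct toy_task {
    const struct toy_policy *process_policy; /* NULL = none installed */
    const struct toy_policy *range_policy;   /* NULL = none installed */
};

/* Installing MPOL_DEFAULT in a scope never stores a policy; it only
 * deletes whatever policy that scope currently holds. */
void toy_set_process_policy(struct toy_task *t, const struct toy_policy *p)
{
    assert(t != NULL && p != NULL);
    t->process_policy = (p->mode == TOY_MPOL_DEFAULT) ? NULL : p;
}

void toy_set_range_policy(struct toy_task *t, const struct toy_policy *p)
{
    assert(t != NULL && p != NULL);
    t->range_policy = (p->mode == TOY_MPOL_DEFAULT) ? NULL : p;
}

/* Lookup walks outward: memory range, then process, then system. */
const struct toy_policy *toy_effective_policy(const struct toy_task *t)
{
    assert(t != NULL);
    if (t->range_policy != NULL)
        return t->range_policy;
    if (t->process_policy != NULL)
        return t->process_policy;
    return &toy_system_default;
}
```

In this model TOY_MPOL_DEFAULT never shows up as an effective policy: setting it on the range falls back to the process policy, and setting it on the process falls back to the system default, which is the "local allocation" behavior the patch text describes.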