On 01/31/2017 07:27 AM, Dave Hansen wrote:
> On 01/30/2017 05:36 PM, Anshuman Khandual wrote:
>>> Let's say we had a CDM node with 100x more RAM than the rest of the
>>> system and it was just as fast as the rest of the RAM. Would we still
>>> want it isolated like this? Or would we want a different policy?
>>
>> But then the other argument being, dont we want to keep this 100X more
>> memory isolated for some special purpose to be utilized by specific
>> applications ?
>
> I was thinking that in this case, we wouldn't even want to bother with
> having "system RAM" in the fallback lists. A device who got its memory

System RAM is in the fallback list of the CDM node for the following
reason. If the user explicitly asks for CDM memory through mbind() and
there is insufficient memory on the CDM node to fulfill the request,
then it is better to fall back to a system RAM node than to fail the
request. This is in line with what is expected from the mbind() call.
User space has other ways, such as /proc/pid/numa_maps, to query exactly
which node a given page came from at runtime.

But to keep the options open, I have noted this down in the cover letter.

"
FALLBACK zonelist creation:

CDM node's FALLBACK zonelist can also be changed to accommodate other CDM
memory zones along with system RAM zones in which case they can be used as
fallback options instead of first depending on the system RAM zones when
its own memory falls insufficient during allocation.
"

> usage off by 1% could start to starve the rest of the system. A sane

I did not get this point. Could you please elaborate on it?

> policy in this case might be to isolate the "system RAM" from the device's.

Hmm.

>
>>> Why do we need this hard-coded along with the cpuset stuff later in the
>>> series. Doesn't taking a node out of the cpuset also take it out of the
>>> fallback lists?
>>
>> There are two mutually exclusive approaches which are described in
>> this patch series.
>>
>> (1) zonelist modification based approach
>> (2) cpuset restriction based approach
>>
>> As mentioned in the cover letter,
>
> Well, I'm glad you coded both of them up, but now that we have them how
> to we pick which one to throw to the wolves? Or, do we just merge both
> of them and let one bitrot? ;)

I am just trying to see how each isolation method stacks up from a
benefit and cost point of view, so that we can have an informed debate
about their individual merits. Meanwhile, I have started looking at
whether the core buddy allocator __alloc_pages_nodemask() and its
interaction with the nodemask at various stages can also be modified to
implement the intended solution.
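
As a rough illustration of the zonelist modification based approach, here
is a toy user-space model in Python. It is only a sketch and not kernel
code: build_fallback_zonelists(), alloc_page(), the node numbering and the
free page counts are all hypothetical names and values made up for this
example, and the nodemask filter is a simplification of what the real
allocator does. It assumes system RAM nodes never list the CDM node in
their fallback order, while the CDM node lists itself first and system RAM
after that.

```python
# Toy user-space model of zonelist based CDM isolation. Not kernel code.
# All names and numbers here are hypothetical, for illustration only.

def build_fallback_zonelists(system_nodes, cdm_nodes):
    """Return {node: ordered list of nodes to try for an allocation}."""
    zonelists = {}
    for node in system_nodes:
        # System RAM nodes fall back only to other system RAM nodes,
        # so ordinary allocations never spill into CDM memory.
        zonelists[node] = [node] + [n for n in system_nodes if n != node]
    for node in cdm_nodes:
        # A CDM node tries its own memory first, then system RAM.
        zonelists[node] = [node] + list(system_nodes)
    return zonelists


def alloc_page(zonelists, free_pages, preferred, nodemask=None):
    """Walk the preferred node's fallback list and take one page from the
    first allowed node that has free memory. Return that node, or None."""
    for node in zonelists[preferred]:
        if nodemask is not None and node not in nodemask:
            continue  # node filtered out by the caller's nodemask
        if free_pages[node] > 0:
            free_pages[node] -= 1
            return node
    return None


# Nodes 0 and 1 are system RAM, node 2 is the CDM node.
zonelists = build_fallback_zonelists(system_nodes=[0, 1], cdm_nodes=[2])
free_pages = {0: 1, 1: 1, 2: 1}

first = alloc_page(zonelists, free_pages, preferred=2)   # 2: CDM has a page
second = alloc_page(zonelists, free_pages, preferred=2)  # 0: CDM is empty
```

In this model a request that starts on a system RAM node can never land
on node 2, while a request that starts on node 2 degrades to system RAM
instead of failing, which is the behaviour argued for above.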