On Fri 06-11-20 12:32:44, Huang, Ying wrote:
> Michal Hocko <mhocko@xxxxxxxx> writes:
>
> > On Thu 05-11-20 09:40:28, Feng Tang wrote:
> >> On Wed, Nov 04, 2020 at 09:53:43AM +0100, Michal Hocko wrote:
> >>
> >> > > > As I've said in reply to your second patch. I think we can make the oom
> >> > > > killer behavior more sensible in this misconfigured cases but I do not
> >> > > > think we want break the cpuset isolation for such a configuration.
> >> > >
> >> > > Do you mean we skip the killing and just let the allocation fail? We've
> >> > > checked the oom killer code first, when the oom happens, both DRAM
> >> > > node and unmovable node have lots of free memory, and killing process
> >> > > won't improve the situation.
> >> >
> >> > We already do skip oom killer and fail for lowmem allocation requests already.
> >> > This is similar in some sense. Another option would be to kill the
> >> > allocating context which will have less corner cases potentially because
> >> > some allocation failures might be unexpected.
> >>
> >> Yes, this can avoid the helpless oom killing to kill a good process(no
> >> memory pressure at all)
> >>
> >> And I think the important thing is to judge whether this usage (binding
> >> docker like workload to unmovable node) is a valid case :)
> >
> > I am confused. Why would be an unmovable node a problem. Movable
> > allocations can be satisfied from the Zone Normal just fine. It is other
> > way around that is a problem.
> >
> >> Initially, I thought it invalid too, but later think it still makes some
> >> sense for the 2 cases:
> >> * user want to bind his workload to one node(most of user space
> >>   memory) to avoid cross-node traffic, and that node happens to
> >>   be configured as unmovable
> >
> > See above
> >
> >> * one small DRAM node + big PMEM node, and memory latency insensitive
> >>   workload could be bound to the cheaper unmovable PMEM node
> >
> > Please elaborate some more. As long as you have movable and normal nodes
> > then this should be possible with a deal of care - most notably the
> > movable:kernel ratio memory shouldn't be too big.
> >
> > Besides that why does PMEM node have to be MOVABLE only in the first
> > place?
>
> The performance of PMEM is much worse than that of DRAM. If we found
> that some pages on PMEM are accessed frequently (hot), we may want to
> move them to DRAM to optimize the system performance. If the unmovable
> pages are allocated on PMEM and hot, it's possible that we cannot move
> the pages to DRAM unless rebooting the system. So we think we should
> make the PMEM nodes to be MOVABLE only.

That is fair but then you really need a fallback node too. So this is
mere optimization rather than a fundamental restriction.
-- 
Michal Hocko
SUSE Labs