On Wed 17-04-19 10:26:05, Yang Shi wrote:
>
>
> On 4/17/19 9:39 AM, Michal Hocko wrote:
> > On Wed 17-04-19 09:37:39, Keith Busch wrote:
> > > On Wed, Apr 17, 2019 at 05:39:23PM +0200, Michal Hocko wrote:
> > > > On Wed 17-04-19 09:23:46, Keith Busch wrote:
> > > > > On Wed, Apr 17, 2019 at 11:23:18AM +0200, Michal Hocko wrote:
> > > > > > On Tue 16-04-19 14:22:33, Dave Hansen wrote:
> > > > > > > Keith Busch had a set of patches to let you specify the demotion order
> > > > > > > via sysfs for fun. The rules we came up with were:
> > > > > > I am not a fan of any sysfs "fun"
> > > > > I'm hung up on the user facing interface, but there should be some way a
> > > > > user decides if a memory node is or is not a migrate target, right?
> > > > Why? Or to put it differently, why do we have to start with a user
> > > > interface at this stage when we actually barely have any real usecases
> > > > out there?
> > > The use case is an alternative to swap, right? The user has to decide
> > > which storage is the swap target, so operating in the same spirit.
> > I do not follow. If you use rebalancing you can still deplete the memory
> > and end up in a swap storage. If you want to reclaim/swap rather than
> > rebalance then you do not enable rebalancing (by node_reclaim or similar
> > mechanism).
>
> I'm a little bit confused. Do you mean just do *not* do reclaim/swap in
> rebalancing mode? If rebalancing is on, then node_reclaim just move the
> pages around nodes, then kswapd or direct reclaim would take care of swap?

Yes, that was the idea I wanted to get through. Sorry if that was not
really clear.

> If so the node reclaim on PMEM node may rebalance the pages to DRAM node?
> Should this be allowed?

Why shouldn't it? If there are other vacant nodes to absorb that memory
then why not use them?

> I think both I and Keith was supposed to treat PMEM as a tier in the reclaim
> hierarchy. The reclaim should push inactive pages down to PMEM, then swap.
> So, PMEM is kind of a "terminal" node. So, he introduced sysfs defined
> target node, I introduced N_CPU_MEM.

I understand that. And I am trying to figure out whether we really have
to treat PMEM specially here. Why is it any better than generic NUMA
rebalancing code that could be used for many other usecases which are
not PMEM specific? If you present PMEM as regular memory then also use
it as normal memory.
-- 
Michal Hocko
SUSE Labs