Hi, Yu,

Yu Zhao <yuzhao@xxxxxxxxxx> writes:

[snip]

> +static int get_swappiness(struct lruvec *lruvec, struct scan_control *sc)
> +{
> +	struct mem_cgroup *memcg = lruvec_memcg(lruvec);
> +	struct pglist_data *pgdat = lruvec_pgdat(lruvec);
> +
> +	if (!can_demote(pgdat->node_id, sc) &&
> +	    mem_cgroup_get_nr_swap_pages(memcg) < MIN_LRU_BATCH)
> +		return 0;
> +
> +	return mem_cgroup_swappiness(memcg);
> +}
> +

We have tested v9 on a memory tiering system, and demotion now works
even without a swap device configured.  Thanks!

However, we found that the demotion (page reclaim on the DRAM nodes)
speed is lower than that of the original implementation.

The workload itself is just a memory-accessing micro-benchmark with a
Gauss distribution.  It is run on a system with DRAM and PMEM.
Initially, quite a few hot pages are placed in PMEM and quite a few
cold pages are placed in DRAM.  Then the page placement optimizing
mechanism based on NUMA balancing tries to promote some hot pages from
the PMEM node to the DRAM node.  When the DRAM node is nearly full
(reaches the high watermark), kswapd of the DRAM node is woken up to
demote (reclaim) some cold DRAM pages to PMEM.  Because quite a few
pages in DRAM are very cold (not accessed for at least several
seconds), the benchmark performance is better when the demotion speed
is higher.

Some data from /proc/vmstat and perf-profile is as follows.