> We have /sys/devices/virtual/memory_tiering/memory_tier*/nodelist
> already.  A node in a higher tier can demote to any node in the lower
> tiers.  What's more need to be displayed in nodeX/demotion_nodes?

IIRC, they are not the same. In memory_tier[number], the number is shared by
all memory that uses the same memory driver (dax/kmem etc.). It does not
reflect the actual distance between nodes (nodes at different distances are
grouped into the same memory_tier), whereas demotion only selects the nearest
nodes as targets.

Below is an example: node0 and node1 are DRAM, node2 and node3 are PMEM, but
their distances to the DRAM nodes differ.

# numactl -H
available: 4 nodes (0-3)
node 0 cpus: 0
node 0 size: 964 MB
node 0 free: 746 MB
node 1 cpus: 1
node 1 size: 685 MB
node 1 free: 455 MB
node 2 cpus:
node 2 size: 896 MB
node 2 free: 897 MB
node 3 cpus:
node 3 size: 896 MB
node 3 free: 896 MB
node distances:
node   0   1   2   3
  0:  10  20  20  25
  1:  20  10  25  20
  2:  20  25  10  20
  3:  25  20  20  10
# cat /sys/devices/system/node/node0/demotion_nodes
2
# cat /sys/devices/system/node/node1/demotion_nodes
3
# cat /sys/devices/virtual/memory_tiering/memory_tier22/nodelist
2-3

Thanks
Zhijian

(I hate the outlook native reply composition format.)

________________________________________
From: Huang, Ying <ying.huang@xxxxxxxxx>
Sent: Thursday, November 2, 2023 11:17
To: Li, Zhijian/李 智坚
Cc: Andrew Morton; Greg Kroah-Hartman; rafael@xxxxxxxxxx; linux-mm@xxxxxxxxx; Gotou, Yasunori/五島 康文; linux-kernel@xxxxxxxxxxxxxxx
Subject: Re: [PATCH RFC 1/4] drivers/base/node: Add demotion_nodes sys infterface

Li Zhijian <lizhijian@xxxxxxxxxxx> writes:

> It shows the demotion target nodes of a node. Export this information to
> user directly.
>
> Below is an example where node0 node1 are DRAM, node3 is a PMEM node.
> - Before PMEM is online, no demotion_nodes for node0 and node1.
> $ cat /sys/devices/system/node/node0/demotion_nodes
>  <show nothing>
> - After node3 is online as kmem
> $ daxctl reconfigure-device --mode=system-ram --no-online dax0.0 && daxctl online-memory dax0.0
> [
>   {
>     "chardev":"dax0.0",
>     "size":1054867456,
>     "target_node":3,
>     "align":2097152,
>     "mode":"system-ram",
>     "online_memblocks":0,
>     "total_memblocks":7
>   }
> ]
> $ cat /sys/devices/system/node/node0/demotion_nodes
> 3
> $ cat /sys/devices/system/node/node1/demotion_nodes
> 3
> $ cat /sys/devices/system/node/node3/demotion_nodes
>  <show nothing>

We have /sys/devices/virtual/memory_tiering/memory_tier*/nodelist
already.  A node in a higher tier can demote to any node in the lower
tiers.  What's more need to be displayed in nodeX/demotion_nodes?

--
Best Regards,
Huang, Ying

> Signed-off-by: Li Zhijian <lizhijian@xxxxxxxxxxx>
> ---
>  drivers/base/node.c          | 13 +++++++++++++
>  include/linux/memory-tiers.h |  6 ++++++
>  mm/memory-tiers.c            |  8 ++++++++
>  3 files changed, 27 insertions(+)
>
> diff --git a/drivers/base/node.c b/drivers/base/node.c
> index 493d533f8375..27e8502548a7 100644
> --- a/drivers/base/node.c
> +++ b/drivers/base/node.c
> @@ -7,6 +7,7 @@
>  #include <linux/init.h>
>  #include <linux/mm.h>
>  #include <linux/memory.h>
> +#include <linux/memory-tiers.h>
>  #include <linux/vmstat.h>
>  #include <linux/notifier.h>
>  #include <linux/node.h>
> @@ -569,11 +570,23 @@ static ssize_t node_read_distance(struct device *dev,
>  }
>  static DEVICE_ATTR(distance, 0444, node_read_distance, NULL);
>
> +static ssize_t demotion_nodes_show(struct device *dev,
> +				   struct device_attribute *attr, char *buf)
> +{
> +	int ret;
> +	nodemask_t nmask = next_demotion_nodes(dev->id);
> +
> +	ret = sysfs_emit(buf, "%*pbl\n", nodemask_pr_args(&nmask));
> +	return ret;
> +}
> +static DEVICE_ATTR_RO(demotion_nodes);
> +
>  static struct attribute *node_dev_attrs[] = {
>  	&dev_attr_meminfo.attr,
>  	&dev_attr_numastat.attr,
>  	&dev_attr_distance.attr,
>  	&dev_attr_vmstat.attr,
> +	&dev_attr_demotion_nodes.attr,
>  	NULL
>  };
>
> diff --git a/include/linux/memory-tiers.h b/include/linux/memory-tiers.h
> index 437441cdf78f..8eb04923f965 100644
> --- a/include/linux/memory-tiers.h
> +++ b/include/linux/memory-tiers.h
> @@ -38,6 +38,7 @@ void init_node_memory_type(int node, struct memory_dev_type *default_type);
>  void clear_node_memory_type(int node, struct memory_dev_type *memtype);
>  #ifdef CONFIG_MIGRATION
>  int next_demotion_node(int node);
> +nodemask_t next_demotion_nodes(int node);
>  void node_get_allowed_targets(pg_data_t *pgdat, nodemask_t *targets);
>  bool node_is_toptier(int node);
>  #else
> @@ -46,6 +47,11 @@ static inline int next_demotion_node(int node)
>  	return NUMA_NO_NODE;
>  }
>
> +static inline next_demotion_nodes next_demotion_nodes(int node)
> +{
> +	return NODE_MASK_NONE;
> +}
> +
>  static inline void node_get_allowed_targets(pg_data_t *pgdat, nodemask_t *targets)
>  {
>  	*targets = NODE_MASK_NONE;
> diff --git a/mm/memory-tiers.c b/mm/memory-tiers.c
> index 37a4f59d9585..90047f37d98a 100644
> --- a/mm/memory-tiers.c
> +++ b/mm/memory-tiers.c
> @@ -282,6 +282,14 @@ void node_get_allowed_targets(pg_data_t *pgdat, nodemask_t *targets)
>  	rcu_read_unlock();
>  }
>
> +nodemask_t next_demotion_nodes(int node)
> +{
> +	if (!node_demotion)
> +		return NODE_MASK_NONE;
> +
> +	return node_demotion[node].preferred;
> +}
> +
>  /**
>   * next_demotion_node() - Get the next node in the demotion path
>   * @node: The starting node to lookup the next node