On Wed, 13 Apr 2022 14:52:01 +0530 Jagdish Gediya <jvgediya@xxxxxxxxxxxxx> wrote:

> Current implementation to find the demotion targets works
> based on node state N_MEMORY, however some systems may have
> dram only memory numa node which are N_MEMORY but not the
> right choices as demotion targets.

Why are they not the right choice?  Please describe this fully so we
can understand the motivation and end-user benefit of the proposed
change.  And please more fully describe the end-user benefits of this
change.

> This patch series introduces the new node state
> N_DEMOTION_TARGETS, which is used to distinguish the nodes which
> can be used as demotion targets, node_states[N_DEMOTION_TARGETS]
> is used to hold the list of nodes which can be used as demotion
> targets, support is also added to set the demotion target
> list from user space so that default behavior can be overridden.

Permanently extending the kernel ABI is a fairly big deal.  Please
fully explain the end-user value, usage scenarios, etc.

What would go wrong if we simply omitted this interface?

> node state N_DEMOTION_TARGETS is also set from the dax kmem
> driver, certain type of memory which registers through dax kmem
> (e.g. HBM) may not be the right choices for demotion so in future
> they should be distinguished based on certain attributes and dax
> kmem driver should avoid setting them as N_DEMOTION_TARGETS,
> however current implementation also doesn't distinguish any
> such memory and it considers all N_MEMORY as demotion targets
> so this patch series doesn't modify the current behavior.
>
> Current code which sets migration targets is modified in
> this patch series to avoid some of the limitations on the demotion
> target sharing and to use N_DEMOTION_TARGETS only nodes while
> finding demotion targets.
>
> Changelog
> ----------
>
> v2:
> In v1, only 1st patch of this patch series was sent, which was
> implemented to avoid some of the limitations on the demotion
> target sharing, however for certain numa topology, the demotion
> targets found by that patch was not most optimal, so 1st patch
> in this series is modified according to suggestions from Huang
> and Baolin. Different examples of demotion list comparison
> between existing implementation and changed implementation can
> be found in the commit message of 1st patch.
>
> Jagdish Gediya (5):
>   mm: demotion: Set demotion list differently
>   mm: demotion: Add new node state N_DEMOTION_TARGETS
>   mm: demotion: Add support to set targets from userspace
>   device-dax/kmem: Set node state as N_DEMOTION_TARGETS
>   mm: demotion: Build demotion list based on N_DEMOTION_TARGETS
>
>  .../ABI/testing/sysfs-kernel-mm-numa          | 12 ++++

This description is rather brief.  Some additional user-facing material
under Documentation/ would help.  Describe the format for writing to
the file, what is seen when reading from it, provide a bit of help to
the user so they can understand how to use it, what effects they might
see, etc.

>  drivers/base/node.c                           |  4 ++
>  drivers/dax/kmem.c                            |  2 +
>  include/linux/nodemask.h                      |  1 +
>  mm/migrate.c                                  | 67 +++++++++++++++----
>  5 files changed, 72 insertions(+), 14 deletions(-)
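
For readers following the thread, the distinction the cover letter draws
between N_MEMORY and N_DEMOTION_TARGETS can be illustrated with a small
user-space sketch.  This is not the code from the series: the bitmask
representation, node_set_state(), node_state(), pick_demotion_target()
and the distance table are all hypothetical stand-ins, assumed here only
to show a node that has memory yet is never chosen as a demotion target.

```c
#include <assert.h>
#include <limits.h>

#define MAX_NODES 8
#define NO_TARGET (-1)

/* Hypothetical user-space stand-ins for two kernel node states. */
enum node_state_id { N_MEMORY, N_DEMOTION_TARGETS, NR_STATES };

/* One bit per node: bit n set means node n is in that state. */
static unsigned int node_states[NR_STATES];

static void node_set_state(int node, enum node_state_id state)
{
	assert(node >= 0 && node < MAX_NODES);
	node_states[state] |= 1u << node;
}

static int node_state(int node, enum node_state_id state)
{
	assert(node >= 0 && node < MAX_NODES);
	return (node_states[state] >> node) & 1u;
}

/*
 * Return the nearest node, by the caller-supplied distance table, that
 * is marked N_DEMOTION_TARGETS.  Nodes that are only N_MEMORY are
 * skipped.  Returns NO_TARGET when no other node qualifies.
 */
static int pick_demotion_target(int src, int distance[MAX_NODES][MAX_NODES])
{
	int best = NO_TARGET;
	int best_dist = INT_MAX;

	assert(src >= 0 && src < MAX_NODES);
	for (int n = 0; n < MAX_NODES; n++) {
		if (n == src || !node_state(n, N_DEMOTION_TARGETS))
			continue;
		if (distance[src][n] < best_dist) {
			best = n;
			best_dist = distance[src][n];
		}
	}
	return best;
}
```

With two DRAM nodes marked only N_MEMORY and one slower node also marked
N_DEMOTION_TARGETS, both DRAM nodes would demote to the slower node even
when the other DRAM node is closer, and the slower node itself would
have no target.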