[RFC PATCH 00/11] mm/mempolicy: Make task->mempolicy externally modifiable via syscall and procfs

This patch set makes task->mempolicy modifiable by tasks other than
current.

The ultimate goal is to make mempolicy more flexible and extensible,
for example by adding interleave weights (which may need to change at
runtime due to hotplug events).  Making mempolicy externally modifiable
allows userland daemons to make runtime performance adjustments to
running tasks without that software needing to be made NUMA-aware.

This initial RFC involves three major updates to mempolicy.

1. Refactor modifying interfaces to accept a task as an argument,
   and change existing callers to send `current` in to retain
   the existing behavior.

2. Change locking behaviors to ensure task->mempolicy is referenced
   safely by acquiring the task_lock where required.  Since
   allocators take the alloc lock (task lock), this prevents
   changes from being made during allocations.

3. Add external interfaces which allow a task's mempolicy to be
   modified by another task.  This is implemented as four syscalls
   and a procfs interface:
        sys_set_task_mempolicy
        sys_get_task_mempolicy
        sys_set_task_mempolicy_home_node
        sys_task_mbind
        /proc/[pid]/mempolicy
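
The refactor in step 1 can be pictured with a small userspace sketch.
This is not the kernel code: every demo_* name below is hypothetical,
and the "policy" is reduced to a bare enum standing in for
task->mempolicy.  It only illustrates the pattern of a worker that
takes a task argument plus a wrapper that passes the current task to
keep the old behavior.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins; not the kernel's definitions. */
enum demo_mode { DEMO_DEFAULT, DEMO_PREFERRED, DEMO_INTERLEAVE };

struct demo_task {
	int pid;
	enum demo_mode mode;	/* stands in for task->mempolicy */
};

static struct demo_task demo_current_task = { 1, DEMO_DEFAULT };

/* Stand-in for the kernel's `current`. */
struct demo_task *demo_current(void)
{
	return &demo_current_task;
}

/* Refactored worker: operates on whichever task it is handed. */
int demo_do_set_policy(struct demo_task *task, enum demo_mode mode)
{
	if (task == NULL)
		return -1;
	task->mode = mode;
	return 0;
}

/* Pre-existing entry point: unchanged behavior, now passing "current". */
int demo_set_policy(enum demo_mode mode)
{
	return demo_do_set_policy(demo_current(), mode);
}
```

A pid-based variant would look up the target task and call the same
worker; the lookup, permission checks, and locking are left out of
this sketch.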

The new syscalls are the same as their current-task counterparts,
except that they take a pid as an argument.  The exception is
task_mbind, which required a new struct due to the number of args.

The /proc/[pid]/mempolicy interface re-uses the mpol_parse_str format
to enable get/set of mempolicy via procfs.

mpol_parse_str format:
            <mode>[=<flags>][:<nodelist>]

Example usage:

echo "default" > /proc/pid/mempolicy
echo "prefer=relative:0" > /proc/pid/mempolicy
echo "interleave:0-3" > /proc/pid/mempolicy
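
To make the format concrete, here is a minimal userspace sketch that
splits such a string into its three fields.  It is not the kernel's
mpol_parse_str: demo_split_policy, struct demo_policy_str and
DEMO_FIELD_MAX are hypothetical names, the field size limit is
arbitrary, and the sketch neither validates mode or flag names nor
strips a trailing newline such as the one echo appends.

```c
#include <stddef.h>
#include <string.h>

#define DEMO_FIELD_MAX 32	/* arbitrary limit for this sketch */

struct demo_policy_str {
	char mode[DEMO_FIELD_MAX];
	char flags[DEMO_FIELD_MAX];	/* empty if no "=<flags>" */
	char nodelist[DEMO_FIELD_MAX];	/* empty if no ":<nodelist>" */
};

/* Copy len bytes into dst and NUL-terminate; fail if it does not fit. */
static int demo_copy_field(char *dst, const char *start, size_t len)
{
	if (len >= DEMO_FIELD_MAX)
		return -1;
	memcpy(dst, start, len);
	dst[len] = '\0';
	return 0;
}

/* Split "<mode>[=<flags>][:<nodelist>]"; returns 0 on success, -1 on error. */
int demo_split_policy(const char *s, struct demo_policy_str *out)
{
	const char *colon, *eq, *head_end, *mode_end;

	if (s == NULL || out == NULL)
		return -1;
	memset(out, 0, sizeof(*out));

	/* Everything after the first ':' is the nodelist. */
	colon = strchr(s, ':');
	head_end = colon ? colon : s + strlen(s);

	/* Look for '=' only in the part before the nodelist. */
	eq = memchr(s, '=', (size_t)(head_end - s));
	mode_end = eq ? eq : head_end;

	if (mode_end == s)
		return -1;	/* the mode is mandatory */
	if (demo_copy_field(out->mode, s, (size_t)(mode_end - s)))
		return -1;
	if (eq && demo_copy_field(out->flags, eq + 1,
				  (size_t)(head_end - (eq + 1))))
		return -1;
	if (colon && demo_copy_field(out->nodelist, colon + 1,
				     strlen(colon + 1)))
		return -1;
	return 0;
}
```

With this sketch, "prefer=relative:0" splits into mode "prefer", flags
"relative" and nodelist "0", while "default" yields only a mode.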

Changing the mempolicy via the procfs interface does not induce memory
migrations (the same behavior as set_mempolicy).

Signed-off-by: Gregory Price <gregory.price@xxxxxxxxxxxx>

Gregory Price (11):
  mm/mempolicy: refactor do_set_mempolicy for code re-use
  mm/mempolicy: swap cond reference counting logic in do_get_mempolicy
  mm/mempolicy: refactor set_mempolicy stack to take a task argument
  mm/mempolicy: modify get_mempolicy call stack to take a task argument
  mm/mempolicy: modify set_mempolicy_home_node to take a task argument
  mm/mempolicy: modify do_mbind to operate on task argument instead of
    current
  mm/mempolicy: add task mempolicy syscall variants
  mm/mempolicy: export replace_mempolicy for use by procfs
  mm/mempolicy: build mpol_parse_str unconditionally
  mm/mempolicy: mpol_parse_str should ignore trailing characters in
    nodelist
  fs/proc: Add mempolicy attribute to allow read/write of task mempolicy

 arch/x86/entry/syscalls/syscall_32.tbl |   4 +
 arch/x86/entry/syscalls/syscall_64.tbl |   4 +
 fs/proc/Makefile                       |   1 +
 fs/proc/base.c                         |   1 +
 fs/proc/internal.h                     |   1 +
 fs/proc/mempolicy.c                    | 117 +++++++
 include/linux/mempolicy.h              |  13 +-
 include/linux/syscalls.h               |  14 +
 include/uapi/asm-generic/unistd.h      |  10 +-
 include/uapi/linux/mempolicy.h         |  10 +
 mm/mempolicy.c                         | 432 +++++++++++++++++++------
 11 files changed, 502 insertions(+), 105 deletions(-)
 create mode 100644 fs/proc/mempolicy.c

-- 
2.39.1