On Wed, 2008-07-30 at 00:04 +0200, Michael Kerrisk wrote:
> Hi Lee
>
> On Tue, Jul 29, 2008 at 11:54 PM, Lee Schermerhorn
> <Lee.Schermerhorn@xxxxxx> wrote:
> > Michael:
> >
> > The numactl package contains two man pages that describe core kernel
> > features. These are:
> >
> > move_pages.2 - system call doc.
> >
> > numa_maps.5 - /proc/<pid>/numa_maps documentation.
> >
> > IMO, these should be moved into the kernel man pages sources. All the
> > other man pages in the numactl package describe user space libs and
> > tools, so I think they should stay with the package.
> >
> > What do you think?
>
> Thanks for bringing this up Lee. It's been on my TODO list for a
> while to do exactly what you suggest. Can you remove these from the
> numa package, and send the current versions to me?

Michael:

I have attached the two kernel man pages from the numactl package
[numactl-2.0.2]. I have some proposed updates to these pages that I'll
send along shortly.

Lee

.\" Hey Emacs! This file is -*- nroff -*- source.
.\"
.\" This manpage is Copyright (C) 2006 Silicon Graphics, Inc.
.\" Christoph Lameter
.\"
.\" Permission is granted to make and distribute verbatim copies of this
.\" manual provided the copyright notice and this permission notice are
.\" preserved on all copies.
.\"
.\" Permission is granted to copy and distribute modified versions of this
.\" manual under the conditions for verbatim copying, provided that the
.\" entire resulting derived work is distributed under the terms of a
.\" permission notice identical to this one.
.\"
.TH MOVE_PAGES 2 2006-10-31 "Linux 2.6.18" "Linux Programmer's Manual"
.SH NAME
move_pages \- Move individual pages of a process to another node
.SH SYNOPSIS
.B #include <numaif.h>
.sp
.BI "long move_pages(int " pid ", unsigned long " count ", void ** " pages ", const int * " nodes ", int * " status ", int " flags );
.SH DESCRIPTION
.BR move_pages ()
moves
.I count
pages to the
.IR nodes .
The result of the move is reflected in
.IR status .
The
.I flags
indicate constraints on the pages to be moved.
.P
.I pid
is the ID of the process whose pages are to be moved.
The caller needs sufficient rights to move pages of another process:
it must have root privileges, have the CAP_SYS_NICE capability,
or have the same owner as the target process.
If
.I pid
is 0, then
.BR move_pages ()
moves pages of the calling process.
.P
.I count
is the number of pages to move.
It defines the size of the three arrays
.IR pages ,
.I nodes
and
.IR status .
.P
.I pages
is an array of pointers to the pages that should be moved.
These pointers should be aligned to page boundaries.
Addresses are specified as seen by the process specified by
.IR pid .
.P
.I nodes
is either an array of integers that specify the desired location for each
page, or NULL.
Each integer is a node number.
If NULL is specified, then
.BR move_pages ()
does not move any pages but instead returns the node of each page in the
.I status
array.
Having the status of each page may be necessary to determine
pages that need to be moved.
.P
.I status
is an array of integers that return the status of each page.
The array contains valid values only if
.BR move_pages ()
did not return an error code.
.P
.I flags
specify what types of pages to move.
.B MPOL_MF_MOVE
means that only pages that are in exclusive use by the process
are to be moved.
.B MPOL_MF_MOVE_ALL
means that pages shared between multiple processes can also be moved.
The process must have root privileges or the CAP_SYS_NICE capability to use
.BR MPOL_MF_MOVE_ALL .
.SH "Page states in the status array"
.TP
.B 0..MAX_NUMNODES
Indicates that the location of the page is on this node.
.TP
.B -ENOENT
The page is not present.
.TP
.B -EACCES
The page is mapped by multiple processes and can only be moved if
.I MPOL_MF_MOVE_ALL
is specified.
.TP
.B -EBUSY
The page is currently busy and cannot be moved.
Try again later.
This occurs if a page is undergoing I/O or another kernel subsystem
is holding a reference to the page.
.TP
.B -EFAULT
This is a zero page or the memory area is not mapped by the process.
.TP
.B -ENOMEM
Unable to allocate memory on target node.
.TP
.B -EIO
Unable to write back a page.
The page has to be written back in order to move it, since the page is
dirty and the filesystem does not provide a migration function that would
allow the move of dirty pages.
.TP
.B -EINVAL
A dirty page cannot be moved.
The filesystem does not provide a migration function and has no ability
to write back pages.
.SH "RETURN VALUE"
On success
.BR move_pages ()
returns zero.
.SH ERRORS
.TP
.B -ENOENT
No pages were found that require moving.
All pages are either already on the target node, not present, had an
invalid address or could not be moved because they were mapped by
multiple processes.
.TP
.B -EINVAL
Flags other than
.I MPOL_MF_MOVE
and
.I MPOL_MF_MOVE_ALL
were specified, or an attempt was made to migrate pages of a kernel thread.
.TP
.B -EPERM
.I MPOL_MF_MOVE_ALL
was specified without sufficient privileges, or an attempt was made to
move pages of a process belonging to another user.
.TP
.B -EACCES
One of the target nodes is not allowed by the current cpuset.
.TP
.B -ENODEV
One of the target nodes is not online.
.TP
.B -ESRCH
Process does not exist.
.TP
.B -E2BIG
Too many pages to move.
.TP
.B -EFAULT
Parameter array could not be accessed.
.SH "NOTES"
Use
.BR get_mempolicy (2)
with the
.B MPOL_F_MEMS_ALLOWED
flag to obtain the set of nodes that are allowed by the current cpuset.
Note that this information is subject to change at any time by manual
or automatic reconfiguration of the cpuset.
.P
Use of this function may result in pages whose location [node] violates
the memory policy established for the specified addresses [See
.BR mbind (2)]
and/or the specified process [See
.BR set_mempolicy (2)].
That is, memory policy does not constrain the destination nodes used by
.BR move_pages (2).
.SH "SEE ALSO"
.BR cpuset (5),
.BR get_mempolicy (2),
.BR numa_maps (5),
.BR migratepages (8),
.BR numa_stat (8),
.BR numa (3)
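.SH EXAMPLE
The following is a minimal sketch of the query-only usage described in
DESCRIPTION (nodes set to NULL), assuming a Linux system on which
<sys/syscall.h> defines SYS_move_pages.
It invokes the system call through syscall(2) instead of the <numaif.h>
wrapper shown in the SYNOPSIS, purely so that it builds without linking
against libnuma; that substitution is an assumption of this sketch.
The helper names page_base, query_page_nodes and describe_page_status are
hypothetical, and the returned strings only paraphrase the status table.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Round an address down to the start of its page, because the entries in
 * the pages array should be page aligned.
 * page_size must be a power of two (hypothetical helper). */
static void *page_base(const void *addr, unsigned long page_size)
{
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
    return (void *)((uintptr_t)addr & ~((uintptr_t)page_size - 1));
}

/* Query-only call: nodes is NULL, so no page is moved and status[i]
 * receives the node of pages[i] or a negative errno value.
 * Going through syscall(2), this returns 0 on success and -1 with
 * errno set on failure (hypothetical helper). */
static long query_page_nodes(int pid, unsigned long count,
                             void **pages, int *status)
{
    assert(count > 0 && pages != NULL && status != NULL);
    return syscall(SYS_move_pages, pid, count, pages, NULL, status, 0);
}

/* Translate one status entry into the meaning listed under
 * "Page states in the status array" (strings are paraphrases). */
static const char *describe_page_status(int status)
{
    if (status >= 0)
        return "on node";
    switch (-status) {
    case ENOENT: return "not present";
    case EACCES: return "shared; needs MPOL_MF_MOVE_ALL";
    case EBUSY:  return "busy; retry later";
    case EFAULT: return "zero page or unmapped";
    case ENOMEM: return "no memory on target node";
    case EIO:    return "writeback failed";
    case EINVAL: return "dirty page cannot be moved";
    default:     return "unknown";
    }
}
```

A caller would typically obtain the page size with sysconf(_SC_PAGESIZE),
touch the memory of interest so that the pages are present, align each
address with page_base(), and call query_page_nodes() with a pid of 0 to
inspect its own pages.
Each status entry can then be passed to describe_page_status().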
.\" Copyright (c) 2005 Silicon Graphics Incorporated.
.\" Christoph Lameter, <clameter@xxxxxxx>.
.\"
.TH NUMA_MAPS 5 "06 March 2006" "Linux 2.6" "Linux Programmer's Manual"
.SH NAME
numa_maps \- information about a process's NUMA memory policy and allocation
.SH DESCRIPTION
The file
.B /proc/<pid>/numa_maps
contains information about each memory range used by a given process,
including the effective memory policy for that memory range and the
nodes on which its pages have been allocated.
.P
.B numa_maps
is a read-only file.
When
.B /proc/<pid>/numa_maps
is read, the kernel will scan the virtual address space of the
specified process and report how memory is used.
One line is displayed for each unique memory range of the process.
.P
The first field of each line shows the starting address of the memory range.
This field allows a correlation with the contents of the
.B /proc/<pid>/maps
file, which contains the end address of the range and other information,
such as the access permissions and sharing.
.P
The second field shows the memory policy currently in effect for the
memory range.
Note that the effective policy is not necessarily the policy installed
by the process for that memory range.
Specifically, if the process installed a "default" policy for that range,
the effective policy for that range will be the task policy, which may or
may not be "default".
.P
The rest of the line contains information about the pages allocated in
the memory range.
.DT
.SS Possible information items
.TP 1.5i
.I N<node>=<nr_pages>
The number of pages allocated on
.IR <node> .
.I <nr_pages>
includes only pages currently mapped by the process.
Page migration and memory reclaim may have temporarily unmapped pages
associated with this memory range.
These pages may only show up again after the process has attempted to
reference them.
If the memory range represents a shared memory area or file mapping,
other processes may currently have additional pages mapped in a
corresponding memory range.
.TP 1.5i
.I file=<filename>
The file backing the memory range.
If the file is mapped as private, write accesses may have generated
COW (Copy-On-Write) pages in this memory range.
These pages are displayed as anonymous pages.
.TP 1.5i
.I heap
Memory range is used for the heap.
.TP 1.5i
.I stack
Memory range is used for the stack.
.TP 1.5i
.I huge
Huge memory range.
The page counts shown are huge pages and not regular sized pages.
.TP 1.5i
.I anon=<pages>
The number of anonymous pages in the range.
.TP 1.5i
.I dirty=<pages>
Number of dirty pages.
.TP 1.5i
.I mapped=<pages>
Total number of mapped pages, if different from
.I dirty
and
.I anon
pages.
.TP 1.5i
.I mapmax=<count>
Maximum mapcount (number of processes mapping a single page) encountered
during the scan.
This may be used as an indicator of the degree of sharing occurring in a
given memory range.
.TP 1.5i
.I swapcache=<count>
Number of pages that have an associated entry on a swap device.
.TP 1.5i
.I active=<pages>
The number of pages on the active list.
This field is only shown if different from the number of pages in this range.
This means that some inactive pages exist in the memory range that may be
removed from memory by the swapper soon.
.TP 1.5i
.I writeback=<pages>
Number of pages that are currently being written out to disk.
.SH FILES
.IR /proc/<pid>/numa_maps ,
.IR /proc/<pid>/maps .
.SH "SEE ALSO"
.BR set_mempolicy (2),
.BR mbind (2),
.BR migratepages (8),
.BR numactl (8),
.BR cpuset (8).
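.SH EXAMPLE
The field layout described above (start address, effective policy, then
information items) can be read with a small parser.
The following is a minimal sketch that extracts only the first two fields
and the N<node>=<nr_pages> items.
The names numa_range, parse_numa_maps_line and MAX_NODES are hypothetical,
MAX_NODES is an illustrative limit rather than a kernel constant, and the
sketch assumes that fields are separated by whitespace and that the start
address is printed in hexadecimal.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAX_NODES 64 /* illustrative limit, not a kernel constant */

struct numa_range {
    unsigned long start;            /* first field: start address */
    char policy[64];                /* second field: effective policy */
    unsigned long pages[MAX_NODES]; /* N<node>=<nr_pages> counts */
};

/* Parse one line of numa_maps into *out.
 * Returns 0 on success, -1 if the line is too long or lacks the two
 * leading fields. Items other than N<node>=<nr_pages> are skipped.
 * A file= name containing whitespace would be split into several
 * tokens; this sketch does not handle that case specially. */
static int parse_numa_maps_line(const char *line, struct numa_range *out)
{
    char buf[1024];
    char *save = NULL;
    char *end = NULL;
    char *tok;
    size_t len;

    assert(line != NULL && out != NULL);
    memset(out, 0, sizeof(*out));

    len = strlen(line);
    if (len >= sizeof(buf))
        return -1;
    memcpy(buf, line, len + 1); /* strtok_r modifies its input */

    /* First field: start address of the memory range (hexadecimal). */
    tok = strtok_r(buf, " \t\n", &save);
    if (tok == NULL)
        return -1;
    out->start = strtoul(tok, &end, 16);
    if (end == tok || *end != '\0')
        return -1;

    /* Second field: memory policy in effect for the range. */
    tok = strtok_r(NULL, " \t\n", &save);
    if (tok == NULL)
        return -1;
    snprintf(out->policy, sizeof(out->policy), "%s", tok);

    /* Remaining fields: keep only the per-node page counts. */
    while ((tok = strtok_r(NULL, " \t\n", &save)) != NULL) {
        unsigned int node;
        unsigned long count;

        if (sscanf(tok, "N%u=%lu", &node, &count) == 2 && node < MAX_NODES)
            out->pages[node] = count;
    }
    return 0;
}
```

For a made-up input line such as
"00400000 default file=/bin/cat mapped=3 N0=2 N1=1", the parser stores the
start address 0x400000, the policy "default", two pages for node 0 and one
page for node 1.
A real program would read the lines of /proc/<pid>/numa_maps with fgets()
and pass each one to parse_numa_maps_line().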