Add the syscall with an explanation of the operations.

Signed-off-by: Muhammad Usama Anjum <usama.anjum@xxxxxxxxxxxxx>
---
 Documentation/admin-guide/mm/soft-dirty.rst | 48 ++++++++++++++++++++-
 1 file changed, 47 insertions(+), 1 deletion(-)

diff --git a/Documentation/admin-guide/mm/soft-dirty.rst b/Documentation/admin-guide/mm/soft-dirty.rst
index cb0cfd6672fa..030d75658010 100644
--- a/Documentation/admin-guide/mm/soft-dirty.rst
+++ b/Documentation/admin-guide/mm/soft-dirty.rst
@@ -5,7 +5,12 @@ Soft-Dirty PTEs
 ===============
 
 The soft-dirty is a bit on a PTE which helps to track which pages a task
-writes to. In order to do this tracking one should
+writes to.
+
+Using Proc FS
+-------------
+
+In order to do this tracking one should
 
   1. Clear soft-dirty bits from the task's PTEs.
 
@@ -20,6 +25,47 @@ writes to. In order to do this tracking one should
      64-bit qword is the soft-dirty one. If set, the respective PTE was
      written to since step 1.
 
+Using System Call
+-----------------
+
+The process_memwatch system call can be used to find the dirty pages::
+
+    long process_memwatch(int pidfd, unsigned long start, int len,
+                          unsigned int flags, void *vec, int vec_len);
+
+The pidfd argument specifies the process whose memory is to be watched.
+The calling process must have PTRACE_MODE_ATTACH_FSCREDS access to the
+process whose pidfd has been specified. It can be zero, which means that
+the process wants to watch its own memory. The operation is determined by
+flags. The start argument must be a multiple of the system page size. The
+len argument need not be a multiple of the page size, but since the
+information is returned for whole pages, len is effectively rounded
+up to the next multiple of the page size.
+
+The vec is an output array in which the offsets of the pages are returned.
+Offsets are calculated from the start address. The user tells the kernel the
+size of vec by passing it in vec_len. The system call returns when the
+whole range has been searched or vec is completely filled. The whole range
+isn't cleared if vec fills up completely.
+
+The flags argument specifies the operation to be performed. The MEMWATCH_SD_GET
+and MEMWATCH_SD_CLEAR operations can be used separately, or together to perform
+both atomically as one operation::
+
+    MEMWATCH_SD_GET
+        Get the page offsets which are soft dirty.
+
+    MEMWATCH_SD_CLEAR
+        Clear the pages which are soft dirty.
+
+    MEMWATCH_SD_NO_REUSED_REGIONS
+        This optional flag can be specified in combination with other flags.
+        VM_SOFTDIRTY is ignored for the VMAs for performance reasons. This
+        flag reports as dirty only those pages which have been written to by
+        the user. New allocations are not reported as dirty.
+
+Explanation
+-----------
 Internally, to do this tracking, the writable bit is cleared from PTEs
 when the soft-dirty bit is cleared. So, after this, when the task tries to
-- 
2.30.2
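
The argument rules in the documentation text above (start must be a multiple of the page size, len is effectively rounded up to whole pages) can be illustrated with a small userspace sketch. This is not part of the patch. The helper names memwatch_start_ok, memwatch_effective_len and memwatch_pages_covered are hypothetical, and the sketch assumes the page size is a power of two. It deliberately does not invoke process_memwatch itself, since the patch text does not give a syscall number or flag values.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical helpers, not taken from the patch. They only model the
 * argument rules stated in the documentation text.
 */

/* True if start is a multiple of page_size (page_size must be a power of two). */
bool memwatch_start_ok(unsigned long start, unsigned long page_size)
{
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
    return (start & (page_size - 1)) == 0;
}

/* len rounded up to the next multiple of page_size, as the text describes. */
unsigned long memwatch_effective_len(unsigned long len, unsigned long page_size)
{
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
    return (len + page_size - 1) & ~(page_size - 1);
}

/* Number of whole pages covered by the request: the most pages it can report. */
unsigned long memwatch_pages_covered(unsigned long len, unsigned long page_size)
{
    return memwatch_effective_len(len, page_size) / page_size;
}
```

In a real program the page size would come from sysconf(_SC_PAGESIZE), and the page count gives an upper bound on how many pages a single request can report, which may help when sizing vec.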