Re: [PATCH v1] madvise.2: Document MADV_POPULATE_READ and MADV_POPULATE_WRITE

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



> MADV_POPULATE_READ and MADV_POPULATE_WRITE have been merged into
> upstream Linux via commit 4ca9b3859dac ("mm/madvise: introduce
> MADV_POPULATE_(READ|WRITE) to prefault page tables"), part of v5.14-rc1.
>
> Let's document the behavior and error conditions of these new madvise()
> options.
>
> Cc: Alejandro Colomar <alx.manpages@xxxxxxxxx>
> Cc: Michael Kerrisk <mtk.manpages@xxxxxxxxx>
> Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
> Cc: Michal Hocko <mhocko@xxxxxxxx>
> Cc: Oscar Salvador <osalvador@xxxxxxx>
> Cc: Jann Horn <jannh@xxxxxxxxxx>
> Cc: Mike Rapoport <rppt@xxxxxxxxxx>
> Cc: Linux API <linux-api@xxxxxxxxxxxxxxx>
> Cc: linux-mm@xxxxxxxxx
> Signed-off-by: David Hildenbrand <david@xxxxxxxxxx>
> ---
>  man2/madvise.2 | 80 ++++++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 80 insertions(+)
>
> diff --git a/man2/madvise.2 b/man2/madvise.2
> index f1f384c0c..3ec8c53a7 100644
> --- a/man2/madvise.2
> +++ b/man2/madvise.2
> @@ -469,6 +469,59 @@ If a page is file-backed and dirty, it will be written back to the backing
>  storage.
>  The advice might be ignored for some pages in the range when it is not
>  applicable.
> +.TP
> +.BR MADV_POPULATE_READ " (since Linux 5.14)
> +Populate (prefault) page tables readable for the whole range without actually
> +reading. Depending on the underlying mapping, map the shared zeropage,
> +preallocate memory or read the underlying file; files with holes might or
> +might not preallocate blocks.
> +Do not generate
> +.B SIGBUS
> +when populating fails, return an error instead.
> +.IP
> +If
> +.B MADV_POPULATE_READ
> +succeeds, all page tables have been populated (prefaulted) readable once.
> +If
> +.B MADV_POPULATE_READ
> +fails, some page tables might have been populated.
> +.IP
> +.B MADV_POPULATE_READ
> +cannot be applied to mappings without read permissions
> +and special mappings marked with the kernel-internal
> +.B VM_PFNMAP
> +and
> +.BR VM_IO .
> +.IP
> +Note that with
> +.BR MADV_POPULATE_READ ,
> +the process can be killed at any moment when the system runs out of memory.
> +.TP
> +.BR MADV_POPULATE_WRITE " (since Linux 5.14)
> +Populate (prefault) page tables writable for the whole range without actually
> +writing. Depending on the underlying mapping, preallocate memory or read the
> +underlying file; files with holes will preallocate blocks.
> +Do not generate
> +.B SIGBUS
> +when populating fails, return an error instead.
> +.IP
> +If
> +.B MADV_POPULATE_WRITE
> +succeeds, all page tables have been populated (prefaulted) writable once.
> +If
> +.B MADV_POPULATE_WRITE
> +fails, some page tables might have been populated.
> +.IP
> +.B MADV_POPULATE_WRITE
> +cannot be applied to mappings without write permissions
> +and special mappings marked with the kernel-internal
> +.B VM_PFNMAP
> +and
> +.BR VM_IO .
> +.IP
> +Note that
> +.BR MADV_POPULATE_WRITE ,
> +the process can be killed at any moment when the system runs out of memory.
>  .SH RETURN VALUE
>  On success,
>  .BR madvise ()
> @@ -533,6 +586,17 @@ or
>  .BR VM_PFNMAP
>  ranges.
>  .TP
> +.B EINVAL
> +.I advice
> +is
> +.B MADV_POPULATE_READ
> +or
> +.BR MADV_POPULATE_WRITE ,
> +but the specified address range includes ranges with insufficient permissions,
> +.B VM_IO
> +or
> +.BR VM_PFNMAP.
> +.TP
>  .B EIO
>  (for
>  .BR MADV_WILLNEED )
> @@ -548,6 +612,14 @@ Not enough memory: paging in failed.
>  Addresses in the specified range are not currently
>  mapped, or are outside the address space of the process.
>  .TP
> +.B ENOMEM
> +.I advice
> +is
> +.B MADV_POPULATE_READ
> +or
> +.BR MADV_POPULATE_WRITE ,
> +but populating (prefaulting) page tables failed.
> +.TP
>  .B EPERM
>  .I advice
>  is
> @@ -555,6 +627,14 @@ is
>  but the caller does not have the
>  .B CAP_SYS_ADMIN
>  capability.
> +.TP
> +.B EHWPOISON
> +.I advice
> +is
> +.B MADV_POPULATE_READ
> +or
> +.BR MADV_POPULATE_WRITE ,
> +and a HW poisoned page is encountered.
>  .SH VERSIONS
>  Since Linux 3.18,
>  .\" commit d3ac21cacc24790eb45d735769f35753f5b56ceb

>From the end user point of view, I find document simple and easy to understand.
Did not went deep into the implementation yet, just skimmed a bit.

Acked-by: Pankaj Gupta <pankaj.gupta@xxxxxxxxx>



[Index of Archives]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux