Re: [PATCH man-pages v5] madvise.2: add documentation for MADV_COLLAPSE

Hey Alex!

On Tue, Nov 1, 2022 at 8:47 AM Alejandro Colomar <alx.manpages@xxxxxxxxx> wrote:
>
> Hi Zach!
>
> On 11/1/22 16:03, Zach OKeefe wrote:
> > From: Zach O'Keefe <zokeefe@xxxxxxxxxx>
> >
> > Linux 6.1 introduced MADV_COLLAPSE in upstream commit 7d8faaf15545
> > ("mm/madvise: introduce MADV_COLLAPSE sync hugepage collapse") and
> > upstream commit 34488399fa08 ("mm/madvise: add file and shmem support to
> > MADV_COLLAPSE").  Update the man-pages for madvise(2) and
> > process_madvise(2).
> >
> > Link: https://lore.kernel.org/linux-mm/20220922224046.1143204-1-zokeefe@xxxxxxxxxx/
> > Link: https://lore.kernel.org/linux-mm/20220706235936.2197195-1-zokeefe@xxxxxxxxxx/
> > Signed-off-by: Zach O'Keefe <zokeefe@xxxxxxxxxx>
>
> Patch applied.
> See a minor edit below, in case you're curious.
> It was very nice to get this patch set improved and applied!
>

Awesome! Very excited to see it in :)

> Cheers,
>
> Alex
>
> > ---
> >
> > v4[1] -> v5
> > - Rebased to latest master
> > - (Alejandro Colomar) Applied diff to remove spurious file and fix
> >    semantic newlines.
> > - (Alejandro Colomar) Reworded documentation describing behavior of
> >    setting errno when multiple hugepage-aligned/sized regions fail to
> >    collapse.
> >
> > v3[2] -> v4
> > - Rebased to latest master
> > - (Alejandro Colomar) Fixed weird, non-ascii chars: e2 80 99 -> "'"
> > - (Alejandro Colomar) Replaced .BR with .B directive when the entire
> >    line was bold (no non-bold part)
> >
> > [1] https://lore.kernel.org/linux-man/20221031225500.3994542-1-zokeefe@xxxxxxxxxx/
> > [2] https://lore.kernel.org/linux-man/bb3b5c3c-3966-ea1a-6d84-4f7f3afa37ca@xxxxxxxxx/T/#u
> >
> >   man2/madvise.2         | 91 +++++++++++++++++++++++++++++++++++++++++-
> >   man2/process_madvise.2 | 10 +++++
> >   2 files changed, 99 insertions(+), 2 deletions(-)
> >
> > diff --git a/man2/madvise.2 b/man2/madvise.2
> > index edf805740..038e6023d 100644
> > --- a/man2/madvise.2
> > +++ b/man2/madvise.2
> > @@ -386,9 +386,10 @@ set (see
> >   .BR prctl (2)).
> >   .IP
> >   The
> > -.B MADV_HUGEPAGE
> > +.BR MADV_HUGEPAGE ,
> > +.BR MADV_NOHUGEPAGE ,
> >   and
> > -.B MADV_NOHUGEPAGE
> > +.B MADV_COLLAPSE
> >   operations are available only if the kernel was configured with
> >   .B CONFIG_TRANSPARENT_HUGEPAGE
> >   and file/shmem memory is only supported if the kernel was configured with
> > @@ -401,6 +402,82 @@ and
> >   .I length
> >   will not be backed by transparent hugepages.
> >   .TP
> > +.BR MADV_COLLAPSE " (since Linux 6.1)"
> > +.\" commit 7d8faaf155454f8798ec56404faca29a82689c77
> > +.\" commit 34488399fa08faaf664743fa54b271eb6f9e1321
> > +Perform a best-effort synchronous collapse of
> > +the native pages mapped by the memory range
> > +into Transparent Huge Pages (THPs).
> > +.B MADV_COLLAPSE
> > +operates on the current state of memory of the calling process and
> > +makes no persistent changes or guarantees on how pages will be mapped,
> > +constructed,
> > +or faulted in the future.
> > +.IP
> > +.B MADV_COLLAPSE
> > +supports private anonymous pages (see
> > +.BR mmap (2)),
> > +shmem pages,
> > +and file-backed pages.
> > +See
> > +.B MADV_HUGEPAGE
> > +for general information on memory requirements for THP.
> > +If the range provided spans multiple VMAs,
> > +the semantics of the collapse over each VMA is independent from the others.
> > +If collapse of a given huge page-aligned/sized region fails,
> > +the operation may continue to attempt collapsing
> > +the remainder of the specified memory.
> > +.B MADV_COLLAPSE
> > +will automatically clamp the provided range to be hugepage-aligned.
> > +.IP
> > +All non-resident pages covered by the range
> > +will first be swapped/faulted-in,
> > +before being copied onto a freshly allocated hugepage.
> > +If the native pages compose the same PTE-mapped hugepage,
> > +and are suitably aligned,
> > +allocation of a new hugepage may be elided and
> > +collapse may happen in-place.
> > +Unmapped pages will have their data directly initialized to 0
> > +in the new hugepage.
> > +However,
> > +for every eligible hugepage-aligned/sized region to be collapsed,
> > +at least one page must currently be backed by physical memory.
> > +.IP
> > +.B MADV_COLLAPSE
> > +is independent of any sysfs
> > +(see
> > +.BR sysfs (5))
> > +setting under
> > +.IR /sys/kernel/mm/transparent_hugepage ,
> > +both in terms of determining THP eligibility,
> > +and allocation semantics.
> > +See Linux kernel source file
> > +.I Documentation/admin\-guide/mm/transhuge.rst
> > +for more information.
> > +.B MADV_COLLAPSE
> > +also ignores
> > +.B huge=
> > +tmpfs mount when operating on tmpfs files.
> > +Allocation for the new hugepage may enter direct reclaim and/or compaction,
> > +regardless of VMA flags
> > +(though
> > +.B VM_NOHUGEPAGE
> > +is still respected).
> > +.IP
> > +When the system has multiple NUMA nodes,
> > +the hugepage will be allocated from
> > +the node providing the most native pages.
> > +.IP
> > +If all hugepage-sized/aligned regions covered by the provided range were
> > +either successfully collapsed,
> > +or were already PMD-mapped THPs,
> > +this operation will be deemed successful.
> > +Note that this doesn't guarantee anything about
> > +other possible mappings of the memory.
> > +In the event multiple hugepage-aligned/sized areas fail to collapse,
> > +only the most recently-failed code will be set in
>
> I slightly changed the use of hyphens above with the following diff:
>
> diff --git a/man2/madvise.2 b/man2/madvise.2
> index 038e6023d..331465cfc 100644
> --- a/man2/madvise.2
> +++ b/man2/madvise.2
> @@ -475,7 +475,7 @@ .SS Linux-specific advice values
>   Note that this doesn't guarantee anything about
>   other possible mappings of the memory.
>   In the event multiple hugepage-aligned/sized areas fail to collapse,
> -only the most recently-failed code will be set in
> +only the most-recently\[en]failed code will be set in
>   .IR errno .
>   .TP
>   .BR MADV_DONTDUMP " (since Linux 3.4)"
>
>
> Rationale:
> <https://lists.gnu.org/archive/html/groff/2022-10/msg00019.html>
>

Appreciate the change, and thanks for the link -- very subtle. The
differences between the various dashes (which I only became aware of
thanks to you) are still a little unclear to me.

Again, thank you so much for your patience and help throughout this
process -- I really appreciate it.

Best,
Zach

>
> > +.IR errno .
> > +.TP
> >   .BR MADV_DONTDUMP " (since Linux 3.4)"
> >   .\" commit 909af768e88867016f427264ae39d27a57b6a8ed
> >   .\" commit accb61fe7bb0f5c2a4102239e4981650f9048519
> > @@ -620,6 +697,11 @@ A kernel resource was temporarily unavailable.
> >   .B EBADF
> >   The map exists, but the area maps something that isn't a file.
> >   .TP
> > +.B EBUSY
> > +(for
> > +.BR MADV_COLLAPSE )
> > +Could not charge hugepage to cgroup: cgroup limit exceeded.
> > +.TP
> >   .B EFAULT
> >   .I advice
> >   is
> > @@ -717,6 +799,11 @@ maximum resident set size.
> >   Not enough memory: paging in failed.
> >   .TP
> >   .B ENOMEM
> > +(for
> > +.BR MADV_COLLAPSE )
> > +Not enough memory: could not allocate hugepage.
> > +.TP
> > +.B ENOMEM
> >   Addresses in the specified range are not currently
> >   mapped, or are outside the address space of the process.
> >   .TP
> > diff --git a/man2/process_madvise.2 b/man2/process_madvise.2
> > index ac98850a9..92878286b 100644
> > --- a/man2/process_madvise.2
> > +++ b/man2/process_madvise.2
> > @@ -73,6 +73,10 @@ argument is one of the following values:
> >   See
> >   .BR madvise (2).
> >   .TP
> > +.B MADV_COLLAPSE
> > +See
> > +.BR madvise (2).
> > +.TP
> >   .B MADV_PAGEOUT
> >   See
> >   .BR madvise (2).
> > @@ -173,6 +177,12 @@ The caller does not have permission to access the address space of the process
> >   .TP
> >   .B ESRCH
> >   The target process does not exist (i.e., it has terminated and been waited on).
> > +.PP
> > +See
> > +.BR madvise (2)
> > +for
> > +.IR advice -specific
> > +errors.
> >   .SH VERSIONS
> >   This system call first appeared in Linux 5.10.
> >   .\" commit ecb8ac8b1f146915aa6b96449b66dd48984caacc
>
> --
> <http://www.alejandro-colomar.es/>
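
For anyone reading along who wants to try the new advice value, here is a
minimal C sketch of the behavior described in the madvise.2 text above. It
is an illustration under stated assumptions, not part of the patch: the
helper names clamp_to_hugepages() and try_collapse_demo() are hypothetical,
the 2 MiB hugepage size is assumed (x86-64 with 4 KiB base pages), and the
fallback value 25 for MADV_COLLAPSE is the asm-generic number, used only if
the installed libc headers do not define it yet. The clamping helper
reflects one reading of "clamp the provided range to be hugepage-aligned"
(start rounded up, end rounded down); the kernel does this on its own, so
the helper only makes the effective range visible.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* for madvise() and MAP_ANONYMOUS */
#endif
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

/* Fallback for older libc headers. Assumption: 25 is the asm-generic
 * value; a few architectures use their own numbering. */
#ifndef MADV_COLLAPSE
#define MADV_COLLAPSE 25
#endif

/* Assumed PMD hugepage size: 2 MiB (x86-64 with 4 KiB base pages). */
#define HPAGE_SIZE ((uintptr_t)2 << 20)

/*
 * Hypothetical helper: shrink [start, start + len) to the whole
 * hugepage-aligned/sized regions it contains (start rounded up, end
 * rounded down).  Stores the aligned start in *hstart and returns the
 * remaining length in bytes, or 0 if no full region fits.
 */
static size_t clamp_to_hugepages(uintptr_t start, size_t len,
                                 uintptr_t hsize, uintptr_t *hstart)
{
    assert(hsize != 0 && (hsize & (hsize - 1)) == 0);  /* power of two */

    uintptr_t end = start + len;
    uintptr_t lo = (start + hsize - 1) & ~(hsize - 1);
    uintptr_t hi = end & ~(hsize - 1);

    *hstart = lo;
    return hi > lo ? (size_t)(hi - lo) : 0;
}

/*
 * Hypothetical demo: map private anonymous memory, touch one page in
 * each hugepage-sized region, then request a synchronous collapse.
 * Returns 0 on success or a negated errno value on failure.
 */
static int try_collapse_demo(void)
{
    size_t maplen = 3 * HPAGE_SIZE;  /* holds >= 2 aligned regions */
    char *p = mmap(NULL, maplen, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -errno;

    uintptr_t hstart;
    size_t hlen = clamp_to_hugepages((uintptr_t)p, maplen,
                                     HPAGE_SIZE, &hstart);

    /* Each region to be collapsed needs at least one page that is
     * currently backed by physical memory, so fault one in. */
    for (size_t off = 0; off < hlen; off += HPAGE_SIZE)
        ((volatile char *)hstart)[off] = 1;

    int ret = 0;
    if (madvise((void *)hstart, hlen, MADV_COLLAPSE) != 0)
        ret = -errno;  /* save errno before munmap() can change it */

    munmap(p, maplen);
    return ret;
}
```

Calling try_collapse_demo() from main() and printing strerror() of the
negated return value is enough to see what a given system does. Per the
ERRORS additions in the patch, ENOMEM means a hugepage could not be
allocated and EBUSY means the cgroup limit was exceeded. EINVAL is what I
would expect from a kernel that does not know the advice value at all.
Since the collapse is best-effort and only the most recent failure is
reported in errno, a failure return does not say how many regions were
collapsed.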



