Hey Alex!

On Mon, Oct 31, 2022 at 2:15 PM Alejandro Colomar <alx.manpages@xxxxxxxxx> wrote:
>
> Hi Zach!
>
> On 10/22/22 00:33, Zach OKeefe wrote:
> > From: Zach O'Keefe <zokeefe@xxxxxxxxxx>
> >
> > Linux 6.1 introduced MADV_COLLAPSE in upstream commit 7d8faaf15545
> > ("mm/madvise: introduce MADV_COLLAPSE sync hugepage collapse") and
> > upstream commit 34488399fa08 ("mm/madvise: add file and shmem support to
> > MADV_COLLAPSE"). Update the man-pages for madvise(2) and
> > process_madvise(2).
> >
> > Link: https://lore.kernel.org/linux-mm/20220922224046.1143204-1-zokeefe@xxxxxxxxxx/
> > Link: https://lore.kernel.org/linux-mm/20220706235936.2197195-1-zokeefe@xxxxxxxxxx/
> > Signed-off-by: Zach O'Keefe <zokeefe@xxxxxxxxxx>
>
> There are a few issues with this patch:
>
> alx@asus5775:~/src/linux/man-pages/man-pages$ make lint-man-groff
> LINT (groff) tmp/lint/man2/madvise.2.lint-man.groff.touch
> eqn:man2/madvise.2:473: error: invalid input character code '128'
> eqn:man2/madvise.2:473: error: invalid input character code '153'
> an.tmac:man2/madvise.2:445: style: .BR expects at least 2 arguments, got 1
> an.tmac:man2/madvise.2:456: style: .BR expects at least 2 arguments, got 1
> an.tmac:man2/madvise.2:463: style: .BR expects at least 2 arguments, got 1
> found style problems; aborting
> make: *** [lib/lint-man.mk:77: tmp/lint/man2/madvise.2.lint-man.groff.touch] Error 1
>
>
> Let's investigate them:
>

Thank you :)

> alx@asus5775:~/src/linux/man-pages/man-pages$ sed -n 473p man2/madvise.2
> this operation will be deemed successful.
>
> This one was a bit difficult to track, since the line count seems to be off by one:
>
> alx@asus5775:~/src/linux/man-pages/man-pages$ tbl man2/madvise.2 | hd | grep -C1 ' 80 '
> 00003d40 63 65 73 73 66 75 6c 2e 0a 4e 6f 74 65 20 74 68 |cessful..Note th|
> 00003d50 61 74 20 74 68 69 73 20 64 6f 65 73 6e e2 80 99 |at this doesn...|
> 00003d60 74 20 67 75 61 72 61 6e 74 65 65 20 61 6e 79 74 |t guarantee anyt|
> alx@asus5775:~/src/linux/man-pages/man-pages$ sed -n 474p man2/madvise.2
> Note that this doesn’t guarantee anything about other possible mappings of
>
> The issue was in line 474, and the issue is that it uses a weird single quote.
> Please use the following ASCII character for the single quote (see ascii(7)):
> 047 39 27 '
>

Very weird and good find! Honestly, I had prototyped this in Google Docs
and copy-pasta'd this over as the basis. I tried testing this again - and
same thing - Google Docs uses some other character. Anyways - glad you
caught this.

> The rest of issues seems trivial:
> Use .B instead of .BR because there's no "roman" (i.e., non-bold) part.
>

This was the first time it clicked what ".BR" meant: "bold followed by
roman".

> alx@asus5775:~/src/linux/man-pages/man-pages$ sed -n 445p man2/madvise.2
> .BR MADV_COLLAPSE
> alx@asus5775:~/src/linux/man-pages/man-pages$ sed -n 456p man2/madvise.2
> .BR MADV_COLLAPSE
> alx@asus5775:~/src/linux/man-pages/man-pages$ sed -n 463p man2/madvise.2
> .BR VM_NOHUGEPAGE
>

These didn't show up with my version of groff (as in 1/2), but I've
applied the fixes and sent out a v4 for this patch.

Again, thank you for all your help here!

Best,
Zach

>
> I'll report a bug to groff(1) about the issue with the line count.
>

Ya that's an odd one. Sorry for having to encounter this - must have been
quite confusing. Thank you!

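
For anyone else who pastes text from a word processor: the '128' and '153'
in the eqn errors are the decimal values of the 0x80 and 0x99 bytes that
show up in the hexdump above. Below is a minimal sketch of a check that
would have flagged the line before sending. The function name
first_non_ascii_line() is made up for illustration; it is not part of
man-pages or groff, and it only looks for bytes outside the 7-bit ASCII
range.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper (not part of man-pages or groff).
 * Scans len bytes of buf and returns the 1-based line number of the
 * first byte with the high bit set (>= 0x80), or 0 if every byte is
 * 7-bit ASCII.
 */
size_t first_non_ascii_line(const char *buf, size_t len)
{
    assert(buf != NULL || len == 0);

    size_t line = 1;

    for (size_t i = 0; i < len; i++) {
        unsigned char c = (unsigned char)buf[i];

        if (c >= 0x80)
            return line;    /* found a non-ASCII byte on this line */
        if (c == '\n')
            line++;         /* count lines as we go */
    }
    return 0;               /* buffer is pure ASCII */
}
```

Reading man2/madvise.2 into a buffer and passing it to this function
should point at the line holding the typographic quote, independent of how
groff counts lines.
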
> Cheers,
>
> Alex
>
> > ---
> > man2/madvise.2 | 90 +++++++++++++++++++++++++++++++++++++++++-
> > man2/process_madvise.2 | 10 +++++
> > 2 files changed, 98 insertions(+), 2 deletions(-)
> >
> > diff --git a/man2/madvise.2 b/man2/madvise.2
> > index df3413cc8..b03fc731d 100644
> > --- a/man2/madvise.2
> > +++ b/man2/madvise.2
> > @@ -385,9 +385,10 @@ set (see
> > .BR prctl (2) ).
> > .IP
> > The
> > -.B MADV_HUGEPAGE
> > +.BR MADV_HUGEPAGE ,
> > +.BR MADV_NOHUGEPAGE ,
> > and
> > -.B MADV_NOHUGEPAGE
> > +.B MADV_COLLAPSE
> > operations are available only if the kernel was configured with
> > .B CONFIG_TRANSPARENT_HUGEPAGE
> > and file/shmem memory is only supported if the kernel was configured with
> > @@ -400,6 +401,81 @@ and
> > .I length
> > will not be backed by transparent hugepages.
> > .TP
> > +.BR MADV_COLLAPSE " (since Linux 6.1)"
> > +.\" commit 7d8faaf155454f8798ec56404faca29a82689c77
> > +.\" commit 34488399fa08faaf664743fa54b271eb6f9e1321
> > +Perform a best-effort synchronous collapse of the native pages mapped by the
> > +memory range into Transparent Huge Pages (THPs).
> > +.B MADV_COLLAPSE
> > +operates on the current state of memory of the calling process and makes no
> > +persistent changes or guarantees on how pages will be mapped,
> > +constructed,
> > +or faulted in the future.
> > +.IP
> > +.B MADV_COLLAPSE
> > +supports private anonymous pages (see
> > +.BR mmap (2)),
> > +shmem pages,
> > +and file-backed pages.
> > +See
> > +.B MADV_HUGEPAGE
> > +for general information on memory requirements for THP.
> > +If the range provided spans multiple VMAs,
> > +the semantics of the collapse over each VMA is independent from the others.
> > +If collapse of a given huge page-aligned/sized region fails,
> > +the operation may continue to attempt collapsing the remainder of the
> > +specified memory.
> > +.B MADV_COLLAPSE
> > +will automatically clamp the provided range to be hugepage-aligned.
> > +.IP
> > +All non-resident pages covered by the range will first be
> > +swapped/faulted-in,
> > +before being copied onto a freshly allocated hugepage.
> > +If the native pages compose the same PTE-mapped hugepage,
> > +and are suitably aligned,
> > +allocation of a new hugepage may be elided and collapse may happen
> > +in-place.
> > +Unmapped pages will have their data directly initialized to 0 in the new
> > +hugepage.
> > +However,
> > +for every eligible hugepage-aligned/sized region to be collapsed,
> > +at least one page must currently be backed by physical memory.
> > +.IP
> > +.BR MADV_COLLAPSE
> > +is independent of any sysfs
> > +(see
> > +.BR sysfs (5))
> > +setting under
> > +.IR /sys/kernel/mm/transparent_hugepage ,
> > +both in terms of determining THP eligibility,
> > +and allocation semantics.
> > +See Linux kernel source file
> > +.I Documentation/admin\-guide/mm/transhuge.rst
> > +for more information.
> > +.BR MADV_COLLAPSE
> > +also ignores
> > +.B huge=
> > +tmpfs mount when operating on tmpfs files.
> > +Allocation for the new hugepage may enter direct reclaim and/or compaction,
> > +regardless of VMA flags
> > +(though
> > +.BR VM_NOHUGEPAGE
> > +is still respected).
> > +.IP
> > +When the system has multiple NUMA nodes,
> > +the hugepage will be allocated from the node providing the most native
> > +pages.
> > +.IP
> > +If all hugepage-sized/aligned regions covered by the provided range were
> > +either successfully collapsed,
> > +or were already PMD-mapped THPs,
> > +this operation will be deemed successful.
> > +Note that this doesn’t guarantee anything about other possible mappings of
> > +the memory.
> > +Also note that many failures might have occurred since the operation may
> > +continue to collapse in the event collapse of a single hugepage-sized/aligned
> > +region fails.
> > +.TP
> > .BR MADV_DONTDUMP " (since Linux 3.4)"
> > .\" commit 909af768e88867016f427264ae39d27a57b6a8ed
> > .\" commit accb61fe7bb0f5c2a4102239e4981650f9048519
> > @@ -619,6 +695,11 @@ A kernel resource was temporarily unavailable.
> > .B EBADF
> > The map exists, but the area maps something that isn't a file.
> > .TP
> > +.B EBUSY
> > +(for
> > +.BR MADV_COLLAPSE )
> > +Could not charge hugepage to cgroup: cgroup limit exceeded.
> > +.TP
> > .B EFAULT
> > .I advice
> > is
> > @@ -716,6 +797,11 @@ maximum resident set size.
> > Not enough memory: paging in failed.
> > .TP
> > .B ENOMEM
> > +(for
> > +.BR MADV_COLLAPSE )
> > +Not enough memory: could not allocate hugepage.
> > +.TP
> > +.B ENOMEM
> > Addresses in the specified range are not currently
> > mapped, or are outside the address space of the process.
> > .TP
> > diff --git a/man2/process_madvise.2 b/man2/process_madvise.2
> > index 44d3b94e8..8b0ddccdd 100644
> > --- a/man2/process_madvise.2
> > +++ b/man2/process_madvise.2
> > @@ -73,6 +73,10 @@ argument is one of the following values:
> > See
> > .BR madvise (2).
> > .TP
> > +.B MADV_COLLAPSE
> > +See
> > +.BR madvise (2).
> > +.TP
> > .B MADV_PAGEOUT
> > See
> > .BR madvise (2).
> > @@ -173,6 +177,12 @@ The caller does not have permission to access the address space of the process
> > .TP
> > .B ESRCH
> > The target process does not exist (i.e., it has terminated and been waited on).
> > +.PP
> > +See
> > +.BR madvise (2)
> > +for
> > +.IR advice -specific
> > +errors.
> > .SH VERSIONS
> > This system call first appeared in Linux 5.10.
> > .\" commit ecb8ac8b1f146915aa6b96449b66dd48984caacc
>
> --
> <http://www.alejandro-colomar.es/>
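
As a rough illustration of the interface the quoted text documents, here
is a minimal userspace sketch of requesting a collapse and reporting the
errno values listed in the patch. The wrapper name try_collapse() is
hypothetical. The fallback definition of MADV_COLLAPSE as 25 matches the
Linux 6.1 UAPI headers and is only there for libc headers that predate it.
On kernels older than 6.1 the call is expected to fail with EINVAL.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>

/* Older libc headers may lack this; 25 is the Linux 6.1 UAPI value. */
#ifndef MADV_COLLAPSE
#define MADV_COLLAPSE 25
#endif

/*
 * Hypothetical wrapper, for illustration only.
 * Asks the kernel for a best-effort synchronous collapse of
 * [addr, addr + len) into THPs.
 * Returns 0 on success or a negative errno value, for example:
 *   -EINVAL  kernel does not know MADV_COLLAPSE, or bad arguments
 *   -EBUSY   hugepage could not be charged to the cgroup
 *   -ENOMEM  hugepage could not be allocated, or range not mapped
 */
int try_collapse(void *addr, size_t len)
{
    assert(addr != NULL);

    if (madvise(addr, len, MADV_COLLAPSE) == -1)
        return -errno;
    return 0;
}
```

A caller would typically mmap(2) a region, touch at least one page in each
hugepage-sized part of it (the patch text requires at least one page to be
backed by physical memory), and then call try_collapse() on the range. A
zero return only says that every covered hugepage-sized region is now, or
already was, a PMD-mapped THP in the calling process.
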