On Sun, Dec 11, 2022 at 1:55 PM Alejandro Colomar <alx.manpages@xxxxxxxxx> wrote:
>
> Hey Zach,
>
> On 12/11/22 22:51, Zach O'Keefe wrote:
> > On Sun, Dec 11, 2022 at 9:59 AM Alejandro Colomar
> > <alx.manpages@xxxxxxxxx> wrote:
> >>
> >> Hi Zach,
> >
> > Hey Alex,
> >
> >> On 10/22/22 00:33, Zach OKeefe wrote:
> >>> From: Zach O'Keefe <zokeefe@xxxxxxxxxx>
> >>>
> >>> Linux 6.1 introduced MADV_COLLAPSE in upstream commit 7d8faaf15545
> >>> ("mm/madvise: introduce MADV_COLLAPSE sync hugepage collapse") and
> >>> upstream commit 34488399fa08 ("mm/madvise: add file and shmem support to
> >>> MADV_COLLAPSE"). Update the man-pages for madvise(2) and
> >>> process_madvise(2).
> >>>
> >>> Link: https://lore.kernel.org/linux-mm/20220922224046.1143204-1-zokeefe@xxxxxxxxxx/
> >>> Link: https://lore.kernel.org/linux-mm/20220706235936.2197195-1-zokeefe@xxxxxxxxxx/
> >>> Signed-off-by: Zach O'Keefe <zokeefe@xxxxxxxxxx>
> >>
> >> Please see a few comments below.
> >>
> >
> > Thanks for the mail. So, this patch was taken as commit b106cd5bf
> > ("madvise.2: add documentation for MADV_COLLAPSE"). Some of your
> > comments below were applied (I think, by you) as fixes pre-commit.
> > However, there are some new comments (or ones that address the same
> > lines, but in different ways). Is this mail to log ~ what changes
> > were done, or is there anything actionable here on my side?
>
> Ah no, it's just that I had it marked as unread for some reason, so I thought I
> had forgotten to respond (and I forgot that I had applied it). :-)
>
> So, no action required.
>
> Regarding different suggestions, heh, it demonstrates that it's not exactly
> deterministic :P
>

Heh -- no worries :) Thanks for following up!

> Cheers,
>
> Alex
>
> P.S.: Do you know if I have anything missing from you or any of your colleagues?

At least on my part, I think you've taken all my patches (with help &
edits -- thank you!).
I can't speak for anyone else at Google, however (though, just a very
hasty cross reference between git log and lore.kernel.org/linux-man
seems to indicate patches sent from *@google.com since man-pages-6.00
have previously made it into man-pages-6.01, and nothing afterwards).

Have a great rest of your weekend,

Best,
Zach

>
> >
> > Best,
> > Zach
> >
> > Thanks for this.
> >> Cheers,
> >>
> >> Alex
> >>
> >>> ---
> >>>  man2/madvise.2         | 90 +++++++++++++++++++++++++++++++++++++++++-
> >>>  man2/process_madvise.2 | 10 +++++
> >>>  2 files changed, 98 insertions(+), 2 deletions(-)
> >>>
> >>> diff --git a/man2/madvise.2 b/man2/madvise.2
> >>> index df3413cc8..b03fc731d 100644
> >>> --- a/man2/madvise.2
> >>> +++ b/man2/madvise.2
> >>> @@ -385,9 +385,10 @@ set (see
> >>>  .BR prctl (2) ).
> >>>  .IP
> >>>  The
> >>> -.B MADV_HUGEPAGE
> >>> +.BR MADV_HUGEPAGE ,
> >>> +.BR MADV_NOHUGEPAGE ,
> >>>  and
> >>> -.B MADV_NOHUGEPAGE
> >>> +.B MADV_COLLAPSE
> >>>  operations are available only if the kernel was configured with
> >>>  .B CONFIG_TRANSPARENT_HUGEPAGE
> >>>  and file/shmem memory is only supported if the kernel was configured with
> >>> @@ -400,6 +401,81 @@ and
> >>>  .I length
> >>>  will not be backed by transparent hugepages.
> >>>  .TP
> >>> +.BR MADV_COLLAPSE " (since Linux 6.1)"
> >>> +.\" commit 7d8faaf155454f8798ec56404faca29a82689c77
> >>> +.\" commit 34488399fa08faaf664743fa54b271eb6f9e1321
> >>> +Perform a best-effort synchronous collapse of the native pages mapped by the
> >>
> >> Please use semantic line breaks. In this case, I'd break after "pages".
> >>
> >> man-pages(7):
> >>     Use semantic newlines
> >>         In the source of a manual page, new sentences should be started on new
> >>         lines, long sentences should be split into lines at clause breaks (com‐
> >>         mas, semicolons, colons, and so on), and long clauses should be split
> >>         at phrase boundaries.
> >>         This convention, sometimes known as "semantic
> >>         newlines", makes it easier to see the effect of patches, which often
> >>         operate at the level of individual sentences, clauses, or phrases.
> >>
> >>> +memory range into Transparent Huge Pages (THPs).
> >>> +.B MADV_COLLAPSE
> >>> +operates on the current state of memory of the calling process and makes no
> >>
> >> Here I'd break after "and".
> >>
> >>> +persistent changes or guarantees on how pages will be mapped,
> >>> +constructed,
> >>> +or faulted in the future.
> >>> +.IP
> >>> +.B MADV_COLLAPSE
> >>> +supports private anonymous pages (see
> >>> +.BR mmap (2)),
> >>> +shmem pages,
> >>> +and file-backed pages.
> >>> +See
> >>> +.B MADV_HUGEPAGE
> >>> +for general information on memory requirements for THP.
> >>> +If the range provided spans multiple VMAs,
> >>> +the semantics of the collapse over each VMA is independent from the others.
> >>> +If collapse of a given huge page-aligned/sized region fails,
> >>> +the operation may continue to attempt collapsing the remainder of the
> >>
> >> Break after "collapsing".
> >>
> >>> +specified memory.
> >>> +.B MADV_COLLAPSE
> >>> +will automatically clamp the provided range to be hugepage-aligned.
> >>> +.IP
> >>> +All non-resident pages covered by the range will first be
> >>
> >> Break after "range".
> >>
> >>> +swapped/faulted-in,
> >>> +before being copied onto a freshly allocated hugepage.
> >>> +If the native pages compose the same PTE-mapped hugepage,
> >>> +and are suitably aligned,
> >>> +allocation of a new hugepage may be elided and collapse may happen
> >>
> >> Break before or after "and".
> >>
> >>> +in-place.
> >>> +Unmapped pages will have their data directly initialized to 0 in the new
> >>
> >> Break after "0".
> >>
> >>> +hugepage.
> >>> +However,
> >>> +for every eligible hugepage-aligned/sized region to be collapsed,
> >>> +at least one page must currently be backed by physical memory.
> >>> +.IP
> >>> +.BR MADV_COLLAPSE
> >>
> >> s/BR/B/
> >>
> >>> +is independent of any sysfs
> >>> +(see
> >>> +.BR sysfs (5))
> >>> +setting under
> >>> +.IR /sys/kernel/mm/transparent_hugepage ,
> >>> +both in terms of determining THP eligibility,
> >>> +and allocation semantics.
> >>> +See Linux kernel source file
> >>> +.I Documentation/admin\-guide/mm/transhuge.rst
> >>> +for more information.
> >>> +.BR MADV_COLLAPSE
> >>
> >> s/BR/B/
> >>
> >>> +also ignores
> >>> +.B huge=
> >>> +tmpfs mount when operating on tmpfs files.
> >>> +Allocation for the new hugepage may enter direct reclaim and/or compaction,
> >>> +regardless of VMA flags
> >>> +(though
> >>> +.BR VM_NOHUGEPAGE
> >>
> >> s/BR/B/
> >>
> >>> +is still respected).
> >>> +.IP
> >>> +When the system has multiple NUMA nodes,
> >>> +the hugepage will be allocated from the node providing the most native
> >>
> >> Break after "from".
> >>
> >>> +pages.
> >>> +.IP
> >>> +If all hugepage-sized/aligned regions covered by the provided range were
> >>
> >> Prefer English rather than "/".
> >>
> >>> +either successfully collapsed,
> >>> +or were already PMD-mapped THPs,
> >>> +this operation will be deemed successful.
> >>> +Note that this doesn’t guarantee anything about other possible mappings of
> >>
> >> Break after "about".
> >>
> >>> +the memory.
> >>> +Also note that many failures might have occurred since the operation may
> >>> +continue to collapse in the event collapse of a single hugepage-sized/aligned
> >>
> >> Add some omitted "that" or something that will help readability to
> >> non-native-English readers.
> >>
> >> And break at a better place.
> >>
> >>> +region fails.
> >>> +.TP
> >>>  .BR MADV_DONTDUMP " (since Linux 3.4)"
> >>>  .\" commit 909af768e88867016f427264ae39d27a57b6a8ed
> >>>  .\" commit accb61fe7bb0f5c2a4102239e4981650f9048519
> >>> @@ -619,6 +695,11 @@ A kernel resource was temporarily unavailable.
> >>>  .B EBADF
> >>>  The map exists, but the area maps something that isn't a file.
> >>>  .TP
> >>> +.B EBUSY
> >>> +(for
> >>> +.BR MADV_COLLAPSE )
> >>> +Could not charge hugepage to cgroup: cgroup limit exceeded.
> >>> +.TP
> >>>  .B EFAULT
> >>>  .I advice
> >>>  is
> >>> @@ -716,6 +797,11 @@ maximum resident set size.
> >>>  Not enough memory: paging in failed.
> >>>  .TP
> >>>  .B ENOMEM
> >>> +(for
> >>> +.BR MADV_COLLAPSE )
> >>> +Not enough memory: could not allocate hugepage.
> >>> +.TP
> >>> +.B ENOMEM
> >>>  Addresses in the specified range are not currently
> >>>  mapped, or are outside the address space of the process.
> >>>  .TP
> >>> diff --git a/man2/process_madvise.2 b/man2/process_madvise.2
> >>> index 44d3b94e8..8b0ddccdd 100644
> >>> --- a/man2/process_madvise.2
> >>> +++ b/man2/process_madvise.2
> >>> @@ -73,6 +73,10 @@ argument is one of the following values:
> >>>  See
> >>>  .BR madvise (2).
> >>>  .TP
> >>> +.B MADV_COLLAPSE
> >>> +See
> >>> +.BR madvise (2).
> >>> +.TP
> >>>  .B MADV_PAGEOUT
> >>>  See
> >>>  .BR madvise (2).
> >>> @@ -173,6 +177,12 @@ The caller does not have permission to access the address space of the process
> >>>  .TP
> >>>  .B ESRCH
> >>>  The target process does not exist (i.e., it has terminated and been waited on).
> >>> +.PP
> >>> +See
> >>> +.BR madvise (2)
> >>> +for
> >>> +.IR advice -specific
> >>> +errors.
> >>>  .SH VERSIONS
> >>>  This system call first appeared in Linux 5.10.
> >>>  .\" commit ecb8ac8b1f146915aa6b96449b66dd48984caacc
> >>
> >> --
> >> <http://www.alejandro-colomar.es/>
>
> --
> <http://www.alejandro-colomar.es/>
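
For anyone reading the thread without the kernel side in mind, here is a rough sketch of how the documented MADV_COLLAPSE call might be exercised from userspace. It is only an illustration, not part of the patch: the helper names (align_up, collapse_one_hugepage) are made up for this example, the 2 MiB huge page size is an assumption that holds on typical x86-64 systems but not everywhere, and the fallback definition of MADV_COLLAPSE (25) is there only for libc headers that predate Linux 6.1. The region is touched before the call because, per the text above, at least one page in each hugepage-sized region must be backed by physical memory.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>

/* Older libc headers may not define MADV_COLLAPSE;
 * 25 is the value in the Linux 6.1 UAPI headers. */
#ifndef MADV_COLLAPSE
#define MADV_COLLAPSE 25
#endif

/* Assumption: 2 MiB PMD-sized huge pages (typical on x86-64). */
#define HPAGE_SIZE ((size_t)2 << 20)

/* Hypothetical helper: round addr up to a multiple of align,
 * where align must be a power of two. */
static uintptr_t align_up(uintptr_t addr, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (addr + align - 1) & ~((uintptr_t)align - 1);
}

/* Hypothetical helper: map an anonymous region containing one
 * hugepage-aligned, hugepage-sized range, fault it in, and request a
 * synchronous collapse. Returns 0 on success, otherwise the errno
 * value from the failing call. */
static int collapse_one_hugepage(void)
{
    /* Over-allocate so an aligned hugepage-sized range always fits. */
    size_t len = 2 * HPAGE_SIZE;
    char *raw = mmap(NULL, len, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (raw == MAP_FAILED)
        return errno;

    char *aligned = (char *)align_up((uintptr_t)raw, HPAGE_SIZE);

    /* Touch the range so its pages are resident before collapsing. */
    memset(aligned, 1, HPAGE_SIZE);

    int ret = 0;
    if (madvise(aligned, HPAGE_SIZE, MADV_COLLAPSE) != 0)
        ret = errno;

    munmap(raw, len);
    return ret;
}
```

A caller would check the return value of collapse_one_hugepage() against the errors listed in the patch (EBUSY when the cgroup limit is exceeded, ENOMEM when no hugepage could be allocated). On kernels older than 6.1, or without CONFIG_TRANSPARENT_HUGEPAGE, the call should fail with EINVAL, so a nonzero result is not necessarily a bug in the caller.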