The patch titled Subject: mm/MADV_COLLAPSE: don't expand collapse when vm_end is past requested end has been added to the -mm mm-hotfixes-unstable branch. Its filename is mm-madv_collapse-dont-expand-collapse-when-vm_end-is-past-requested-end.patch This patch will shortly appear at https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patches/mm-madv_collapse-dont-expand-collapse-when-vm_end-is-past-requested-end.patch This patch will later appear in the mm-hotfixes-unstable branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Before you just go and hit "reply", please: a) Consider who else should be cc'ed b) Prefer to cc a suitable mailing list as well c) Ideally: find the original patch on the mailing list and do a reply-to-all to that, adding suitable additional cc's *** Remember to use Documentation/process/submit-checklist.rst when testing your code *** The -mm tree is included into linux-next via the mm-everything branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm and is updated there every 2-3 working days ------------------------------------------------------ From: "Zach O'Keefe" <zokeefe@xxxxxxxxxx> Subject: mm/MADV_COLLAPSE: don't expand collapse when vm_end is past requested end Date: Sat, 24 Dec 2022 00:20:34 -0800 MADV_COLLAPSE acts on one hugepage-aligned/sized region at a time, until it has collapsed all eligible memory contained within the bounds supplied by the user. At the top of each hugepage iteration we (re)lock mmap_lock and (re)validate the VMA for eligibility and update variables that might have changed while mmap_lock was dropped. One thing that might occur is that the VMA could be resized, and as such, we refetch vma->vm_end to make sure we don't collapse past the end of the VMA's new end. However, it's possible that when refetching vma->vm_end that we expand the region acted on by MADV_COLLAPSE if vma->vm_end is greater than size+len supplied by the user. The consequence here is that we may attempt to collapse more memory than requested, possibly yielding either "too much success" or "false failure" user-visible results. An example of the former is if we MADV_COLLAPSE the first 4MiB of a 2TiB mmap()'d file, the incorrect refetch would cause the operation to block for much longer than anticipated as we attempt to collapse the entire TiB region. An example of the latter is that applying MADV_COLLPSE to a 4MiB file mapped to the start of a 6MiB VMA will successfully collapse the first 4MiB, then incorrectly attempt to collapse the last hugepage-aligned/sized region -- fail (since readahead/page cache lookup will fail) -- and report a failure to the user. Don't expand the acted-on region when refetching vma->vm_end. Link: https://lkml.kernel.org/r/20221224082035.3197140-1-zokeefe@xxxxxxxxxx Fixes: 4d24de9425f7 ("mm: MADV_COLLAPSE: refetch vm_end after reacquiring mmap_lock") Signed-off-by: Zach O'Keefe <zokeefe@xxxxxxxxxx> Reported-by: Hugh Dickins <hughd@xxxxxxxxxx> Cc: Yang Shi <shy828301@xxxxxxxxx> Cc: <stable@xxxxxxxxxxxxxxx> Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> --- --- a/mm/khugepaged.c~mm-madv_collapse-dont-expand-collapse-when-vm_end-is-past-requested-end +++ a/mm/khugepaged.c @@ -2647,7 +2647,7 @@ int madvise_collapse(struct vm_area_stru goto out_nolock; } - hend = vma->vm_end & HPAGE_PMD_MASK; + hend = min(hend, vma->vm_end & HPAGE_PMD_MASK); } mmap_assert_locked(mm); memset(cc->node_load, 0, sizeof(cc->node_load)); _ Patches currently in -mm which might be from zokeefe@xxxxxxxxxx are mm-madv_collapse-dont-expand-collapse-when-vm_end-is-past-requested-end.patch mm-shmem-restore-shmem_huge_deny-precedence-over-madv_collapse.patch