Re: [PATCH] mm,madvise,hugetlb: check for 0-length range after end address adjustment

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, 31 Jan 2025 15:37:49 +0100 Ricardo Cañuelo Navarro <rcn@xxxxxxxxxx> wrote:

> Add a sanity check to madvise_dontneed_free() to address a corner case
> in madvise where a race condition causes the current vma being processed
> to be backed by a different page size.
> 
> During a madvise(MADV_DONTNEED) call on a memory region registered with
> a userfaultfd, there's a period of time where the process mm lock is
> temporarily released in order to send a UFFD_EVENT_REMOVE and let
> userspace handle the event. During this time, the vma covering the
> current address range may change due to an explicit mmap done
> concurrently by another thread.
> 
> If, after that change, the memory region, which was originally backed by
> 4KB pages, is now backed by hugepages, the end address is rounded down
> to a hugepage boundary to avoid data loss (see "Fixes" below). This
> rounding may cause the end address to be truncated to the same address
> as the start.
> 
> Make this corner case follow the same semantics as in other similar
> cases where the requested region has zero length (ie. return 0).
> 
> This will make madvise_walk_vmas() continue to the next vma in the
> range (this time holding the process mm lock) which, due to the prev
> pointer becoming stale because of the vma change, will be the same
> hugepage-backed vma that was just checked before. The next time
> madvise_dontneed_free() runs for this vma, if the start address isn't
> aligned to a hugepage boundary, it'll return -EINVAL, which is also in
> line with the madvise api.
> 
> >From userspace perspective, madvise() will return EINVAL because the
> start address isn't aligned according to the new vma alignment
> requirements (hugepage), even though it was correctly page-aligned when
> the call was issued.
> 
> ...
>
> --- a/mm/madvise.c
> +++ b/mm/madvise.c
> @@ -933,7 +933,9 @@ static long madvise_dontneed_free(struct vm_area_struct *vma,
>  			 */
>  			end = vma->vm_end;
>  		}
> -		VM_WARN_ON(start >= end);
> +		if (start == end)
> +			return 0;
> +		VM_WARN_ON(start > end);
>  	}

Perhaps add a comment telling the user how this situation can come about?





[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]

  Powered by Linux