Re: [bug report?] unintuitive behavior when mapping over hugepage-backed PROT_NONE regions

Oscar Salvador <osalvador@xxxxxxx> · Thu, 6 Feb 2025 10:01:05 +0100

On Wed, Feb 05, 2025 at 11:18:34PM -0700, Uday Shankar wrote:
> I was debugging an issue with a malloc implementation when I noticed
> some unintuitive behavior that happens when someone attempts to
> overwrite part of a hugepage-backed PROT_NONE mapping with another
> mapping. I've isolated the issue and reproduced it with the following
> program:
...

> First, we map a 2G PROT_NONE region using hugepages. This succeeds. Then
> we try to map a 4096-length PROT_READ | PROT_WRITE region at the
> beginning of the PROT_NONE region, still using hugepages. This fails, as
> expected, because 4096 is much smaller than the hugepage size configured
> on the system (this is x86 with a default hugepage size of 2M). The

Not really: ksys_mmap_pgoff() aligns len up to the huge page size when
MAP_HUGETLB is set, so the 4096-byte length is not the problem.
It most likely fails with ENOMEM because you did not preallocate any
hugetlb pages: when we reach hugetlbfs_file_mmap()->hugetlb_reserve_pages(),
there are not enough hugetlb pages in the pool to reserve, so it bails out.
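
For illustration, the length round-up and a rough pool check can be
modelled in userspace as below. This is only a sketch, not kernel code:
align_up(), hpages_needed() and hugepages_free() are made-up helper names,
and HugePages_Free in /proc/meminfo is only a rough indicator because it
covers the default huge page size and does not subtract pages that are
already reserved.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Userspace model of rounding len up to a multiple of the huge page size.
 * hpage_size must be a power of two (e.g. 2M on the reporter's x86 box). */
size_t align_up(size_t len, size_t hpage_size)
{
    assert(hpage_size != 0 && (hpage_size & (hpage_size - 1)) == 0);
    return (len + hpage_size - 1) & ~(hpage_size - 1);
}

/* Number of huge pages a mapping of len bytes would need after rounding. */
size_t hpages_needed(size_t len, size_t hpage_size)
{
    return align_up(len, hpage_size) / hpage_size;
}

/* Reads HugePages_Free from /proc/meminfo.
 * Returns -1 if the file or the field is not available. */
long hugepages_free(void)
{
    FILE *f = fopen("/proc/meminfo", "r");
    char line[256];
    long n = -1;

    if (!f)
        return -1;
    while (fgets(line, sizeof line, f)) {
        /* sscanf() returns 1 only on the line carrying this field. */
        if (sscanf(line, "HugePages_Free: %ld", &n) == 1)
            break;
    }
    fclose(f);
    return n;
}
```

With a 2M huge page size, a 4096-byte request rounds up to one huge page,
so the request is valid but still needs one page from the pool. If
hugepages_free() reports 0, an ENOMEM from the reservation step is the
likely outcome.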

> surprising thing is the difference in /proc/pid/smaps before and after
> the failed mmap. Even though the mmap failed, the value in
> /proc/pid/smaps changed, with a 2M-sized bite being taken out the front
> of the mapping. This feels unintuitive to me, as I'd expect a failed
> mmap to have no effect on the virtual memory mappings of the calling
> process whatsoever.

That is because the failure above happens after __mmap_prepare(), which is
responsible for unmapping any overlapping areas, has already run.
I guess it is done this way because rolling back at this point would be
quite tricky.
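
To observe this from userspace, a small probe along the following lines
could help. It is only a sketch and not the reporter's program: it assumes
a 2M huge page size and that the second mmap uses MAP_FIXED, it uses an
ordinary (non-hugetlb) PROT_NONE region for simplicity, and
probe_failed_hugetlb_mmap() and is_mapped() are made-up names. What it
reports depends on the kernel version and on whether hugetlb pages are
available.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

#define HPAGE_SIZE (2UL << 20) /* assumption: 2M default huge page size */

enum probe_result {
    PROBE_SETUP_FAILED   = -1, /* could not create the PROT_NONE region */
    PROBE_MMAP_SUCCEEDED = 0,  /* hugetlb mmap worked, nothing to observe */
    PROBE_STILL_MAPPED   = 1,  /* mmap failed, old mapping is intact */
    PROBE_HOLE_LEFT      = 2,  /* mmap failed, old mapping is gone */
};

/* msync() fails with ENOMEM when (part of) the range is not mapped. */
int is_mapped(void *addr, size_t len)
{
    return msync(addr, len, MS_ASYNC) == 0;
}

int probe_failed_hugetlb_mmap(void)
{
    size_t region_len = 2 * HPAGE_SIZE;
    char *region = mmap(NULL, region_len, PROT_NONE,
                        MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
    if (region == MAP_FAILED)
        return PROBE_SETUP_FAILED;

    /* First huge-page-aligned address inside the region. */
    uintptr_t aligned = ((uintptr_t)region + HPAGE_SIZE - 1) & ~(HPAGE_SIZE - 1);
    assert(aligned + HPAGE_SIZE <= (uintptr_t)region + region_len);
    void *target = (void *)aligned;

    /* Try to place a small hugetlb mapping over the PROT_NONE region. */
    void *p = mmap(target, 4096, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED | MAP_HUGETLB, -1, 0);

    int result;
    if (p != MAP_FAILED)
        result = PROBE_MMAP_SUCCEEDED;
    else
        result = is_mapped(target, 4096) ? PROBE_STILL_MAPPED : PROBE_HOLE_LEFT;

    /* Clean up whatever is left of the region (holes are fine for munmap). */
    munmap(region, region_len);
    return result;
}
```

A PROBE_HOLE_LEFT result would correspond to the behavior described in the
report: the mmap call failed, yet the overlapping part of the old mapping
was already unmapped.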

-- 
Oscar Salvador
SUSE Labs