On Wed, Feb 05, 2025 at 11:18:34PM -0700, Uday Shankar wrote:
> I was debugging an issue with a malloc implementation when I noticed
> some unintuitive behavior that happens when someone attempts to
> overwrite part of a hugepage-backed PROT_NONE mapping with another
> mapping. I've isolated the issue and reproduced it with the following
> program:

...

> First, we map a 2G PROT_NONE region using hugepages. This succeeds. Then
> we try to map a 4096-length PROT_READ | PROT_WRITE region at the
> beginning of the PROT_NONE region, still using hugepages. This fails, as
> expected, because 4096 is much smaller than the hugepage size configured
> on the system (this is x86 with a default hugepage size of 2M). The

Not really, see how ksys_mmap_pgoff() aligns len to huge_page_size if we
set MAP_HUGETLB.
It fails with ENOMEM because you likely did not preallocate any hugetlb
pages, so by the time we do hugetlbfs_file_mmap()->hugetlb_reserve_pages(),
it sees that there are not enough hugetlb pages in the pool to reserve,
and it bails out.

> surprising thing is the difference in /proc/pid/smaps before and after
> the failed mmap. Even though the mmap failed, the value in
> /proc/pid/smaps changed, with a 2M-sized bite being taken out the front
> of the mapping. This feels unintuitive to me, as I'd expect a failed
> mmap to have no effect on the virtual memory mappings of the calling
> process whatsoever.

That is because the above happens after __mmap_prepare(), which is
responsible for unmapping any overlapping areas, has already run.
I guess it is done this way because rolling back at this point would be
quite tricky.

-- 
Oscar Salvador
SUSE Labs