Re: [patch][resend] MAP_HUGETLB munmap fails with size not 2MB aligned

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, 26 Mar 2015, Davide Libenzi wrote:

> > I looked at this thread at http://marc.info/?t=141392508800001 since I 
> > didn't have it in my mailbox, and I didn't get a chance to actually run 
> > your test code.
> > 
> > In short, I think what you're saying is that
> > 
> > 	ptr = mmap(..., 4KB, ..., MAP_HUGETLB | ..., ...)
> > 	munmap(ptr, 4KB) == EINVAL
> 
> I am not sure you have read the email correctly:
> 
> munmap(mmap(size, HUGETLB), size) = EFAIL
> 
> For every size not multiple of the huge page size.
> Whereas:
> 
> munmap(mmap(size, HUGETLB), ALIGN(size, HUGEPAGE_SIZE)) = OK
> 

Yes, I read it correctly, and wrote how your test case should have failed 
above.  It fails when you do the 4KB mmap() with MAP_HUGETLB and pass 4KB 
to munmap(), correct?

I have no idea what EFAIL is, though.

> > The question you pose is whether munmap(ptr, 4KB) should succeed for a 
> > hugetlb vma and in your patch you align this to the hugepage size of the 
> > vma in the same manner that munmap(ptr, 2KB) would be aligned to PAGE_SIZE 
> > for a non-hugetlb vma.
> > 
> > The munmap() spec says the whole pages that include any part of the passed 
> > length should be unmapped.  In spirit, I would agree with you that the 
> > page size for the vma is the hugepage size so that would be what would be 
> > unmapped.
> > 
> > But that's going by a spec that doesn't address hugepages and is worded in 
> > a way that {PAGE_SIZE} is the base unit that both mmap() and munmap() is 
> > done.  It carries no notion of variable page sizes and how hugepages 
> > should be handled with respect to pages of {PAGE_SIZE} length.  So I think 
> > this is beyond the scope of the spec: any length is aligned to PAGE_SIZE, 
> > but the munmap() behavior for hugetlb vmas is not restricted.
> > 
> > It would seem too dangerous at this point to change the behavior of 
> > munmap(ptr, 4KB) on a hugetlb vma and that userspace bugs could actually 
> > arise from aligning to the hugepage size.
> 
> You mean, there is an harder failure than the current failure? :)
> 

Yes, this munmap() behavior of lengths <= hugepage_size - PAGE_SIZE for a 
hugetlb vma is long standing and there may be applications that break as a 
result of changing the behavior: a database that reserves all allocated 
hugetlb memory with mmap() so that it always has exclusive access to those 
hugepages, whether they are faulted or not, and maintains its own hugepage 
pool (which is common), may test the return value of munmap() and depend 
on it returning -EINVAL to determine if it is freeing memory that was 
either dynamically allocated or mapped from the hugetlb reserved pool.

> > Some applications purposefully reserve hugetlb pages by mmap() and never 
> > munmap() them so they have exclusive access to hugepages that were 
> > allocated either at boot or runtime by the sysadmin.  If they depend on 
> > the return value of munmap() to determine if memory to free is memory 
> > dynamically allocated by the application or reserved as hugetlb memory, 
> > then this would cause them to break.  I can't say for certain that no such 
> > application exists.
> 
> The fact that certain applications will seldomly call an API, should be no 
> reason for API to have bugs, or at the very least, a bahviour which not 
> only in not documented in the man pages, but also totally unrespectful of 
> the normal mmap/munmap semantics.

If we were to go back in time and decide this when the munmap() behavior 
for hugetlb vmas was originally introduced, that would be valid.  The 
problem is that it could lead to userspace breakage and that's a 
non-starter.

What we can do is improve the documentation and man-page to clearly 
specify the long-standing behavior so that nobody encounters unexpected 
results in the future.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@xxxxxxxxx.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@xxxxxxxxx";> email@xxxxxxxxx </a>




[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux]     [Linux OMAP]     [Linux MIPS]     [ECOS]     [Asterisk Internet PBX]     [Linux API]