Re: [bug report?] unintuitive behavior when mapping over hugepage-backed PROT_NONE regions

On Fri, Feb 07, 2025 at 01:12:33PM +0000, Lorenzo Stoakes wrote:
> 
> So TL;DR is - aggregate operations failing means any or all of the
> operation failed, you can no longer rely on the mapping state being what
> you expected.

Coming back to the "what should the interface be?" question, I can see
three reasonable answers:
1. Failure should result in no change.  We have a bug and will fix it.
2. Failure should result in no change.  But fixing things is exceedingly
   hard and we may have to live with current reality for a long time.
3. Failure should result in undefined behavior.

I think you convincingly argue against the first answer.  It might still
be useful to also argue against the third answer.


For background, I wrote a somewhat weird memory allocator in 2017,
called "big_allocate".  The underlying problem is that your favorite
malloc tends to do a reasonable job for small to medium objects, but
eventually gives up and calls mmap()/munmap() for large objects.  With a
heavily threaded process, the combination of mmap_sem and TLB shootdown
via IPI is a big performance-killer.  The solution is a specialized
allocator for large objects instead of mmap()/munmap().

The original (and still current) design of big_allocate has a mapping
structure somewhat similar to "struct page" in the kernel.  It relies on
having a large chunk of virtual memory space that it directly controls,
so that it can have a simple 1:1 mapping between virtual memory and
"struct page".
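
To illustrate why one contiguous chunk of address space makes the
mapping simple, here is a minimal sketch of such a 1:1 lookup.  The
names (big_region, big_page, BIG_PAGE_SHIFT) and the 2 MiB granularity
are assumptions made for illustration; the actual big_allocate layout
is not described in this mail.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BIG_PAGE_SHIFT 21               /* hypothetical: 2 MiB granules */

struct big_page {                       /* hypothetical per-granule metadata */
    uint32_t flags;
    uint32_t order;
};

struct big_region {                     /* hypothetical region descriptor */
    char *base;                         /* start of the reserved address range */
    size_t npages;                      /* number of granules in the range */
    struct big_page *pages;             /* one entry per granule */
};

/* Address -> metadata: subtract the base, shift, index.  Returns NULL
 * for addresses outside the region (a below-base address wraps to a
 * huge offset and fails the bounds check). */
static struct big_page *addr_to_page(const struct big_region *r,
                                     const void *addr)
{
    uintptr_t off = (uintptr_t)addr - (uintptr_t)r->base;
    size_t idx = (size_t)(off >> BIG_PAGE_SHIFT);

    return idx < r->npages ? &r->pages[idx] : NULL;
}

/* Metadata -> address: the inverse of the lookup above. */
static void *page_to_addr(const struct big_region *r,
                          const struct big_page *pg)
{
    assert(pg >= r->pages && pg < r->pages + r->npages);
    return r->base + ((size_t)(pg - r->pages) << BIG_PAGE_SHIFT);
}
```

Both directions are pure arithmetic, which only works if the allocator
owns the whole range between base and base + (npages << BIG_PAGE_SHIFT).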

To get a large chunk of virtual memory space, big_allocate does a
PROT_NONE mmap().  It then later does a PROT_READ | PROT_WRITE mmap()
over parts of that range to allocate memory, often combined with
MAP_HUGETLB for obvious performance reasons.  (Side note: I wish a
PROT_RW shorthand existed in the headers.)
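
A minimal sketch of this reserve-then-map pattern, using PROT_NONE for
the reservation and PROT_READ | PROT_WRITE for the later mapping.  The
helper names region_reserve() and region_commit() are hypothetical, and
the exact flags big_allocate passes are an assumption; only the mmap()
calls and flags themselves are standard Linux.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Reserve address space only: PROT_NONE, nothing accessible yet.
 * Returns NULL on failure. */
static void *region_reserve(size_t size)
{
    void *p = mmap(NULL, size, PROT_NONE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);

    return p == MAP_FAILED ? NULL : p;
}

/* Map read-write memory over part of the reservation with MAP_FIXED.
 * extra_flags is 0 or MAP_HUGETLB; with MAP_HUGETLB, offset and len
 * must be hugepage-aligned and the call fails if no hugepages are
 * available.  Returns 0 on success, -1 on failure.  Per the discussion
 * in this thread, after a failure the caller should not assume the
 * range is still the untouched PROT_NONE reservation. */
static int region_commit(void *base, size_t offset, size_t len,
                         int extra_flags)
{
    void *want = (char *)base + offset;
    void *got = mmap(want, len, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED | extra_flags,
                     -1, 0);

    if (got == MAP_FAILED)
        return -1;
    assert(got == want);        /* MAP_FIXED places it exactly here */
    return 0;
}
```

A caller would reserve once at startup, e.g. region_reserve(1UL << 40),
and then call region_commit(base, off, len, MAP_HUGETLB) per allocation,
possibly retrying with extra_flags == 0 if the hugepage mapping fails.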

If memory serves, big_allocate resulted in a 2-3% macrobenchmark
improvement.

Current big_allocate has a number of ugly warts I rather dislike.  One
of those warts is that you now have existing users that rely on mmap()
over existing PROT_NONE mappings working, at least under the special
set of conditions we care about.

I have some plans to rewrite big_allocate with a different design.  But
for now we have existing code that may make your life harder than you
wished for.

Jörn

--
Without congressional action or a strong judicial precedent, I would
_strongly_ recommend against anyone trusting their private data to a
company with physical ties to the United States.
-- Ladar Levison



