On Fri, Feb 07, 2025 at 01:12:33PM +0000, Lorenzo Stoakes wrote:
>
> So TL;DR is - aggregate operations failing means any or all of the
> operation failed, you can no longer rely on the mapping state being what
> you expected.

Coming back to the "what should the interface be?" question, I can see
three reasonable answers:
1. Failure should result in no change.  We have a bug and will fix it.
2. Failure should result in no change.  But fixing things is exceedingly
   hard and we may have to live with current reality for a long time.
3. Failure should result in undefined behavior.

I think you convincingly argue against the first answer.  It might still
be useful to also argue against the third answer.

For background, I wrote a somewhat weird memory allocator in 2017,
called "big_allocate".  Underlying problem is that your favorite malloc
tends to do a reasonable job for small to medium objects, but eventually
gives up and calls mmap()/munmap() for large objects.  With a heavily
threaded process, the combination of mmap_sem and TLB shootdown via IPI
is a big performance-killer.  Solution is a specialized allocator for
large objects that avoids calling mmap()/munmap() per object.

The original (and still current) design of big_allocate has a mapping
structure somewhat similar to "struct page" in the kernel.  It relies on
having a large chunk of virtual memory space that it directly controls,
so that it can have a simple 1:1 mapping between virtual memory and
"struct page".

To get a large chunk of virtual memory space, big_allocate does a
PROT_NONE mmap().  It then later does a PROT_READ|PROT_WRITE mmap() over
part of that range to allocate memory.  Often combined with MAP_HUGETLB,
for obvious performance reasons.  (Side note: I wish a PROT_RW shorthand
existed in the headers.)

If memory serves, big_allocate resulted in a 2-3% macrobenchmark
improvement.

Current big_allocate has a number of ugly warts I rather dislike.  One
of those warts is that you now have existing users that rely on mmap()
over existing PROT_NONE mappings working.
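
For readers who have not seen the pattern, here is a minimal sketch of
"reserve address space first, map usable memory over it later", assuming
Linux and anonymous private mappings.  This is not big_allocate's actual
code: the function names and sizes are made up for illustration, and
MAP_HUGETLB is left out because it needs huge pages configured on the
system.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

/* Reserve address space only: no access rights, nothing committed.
 * (Hypothetical helper, not taken from big_allocate.) */
void *reserve_region(size_t len)
{
    void *p = mmap(NULL, len, PROT_NONE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

/* Map read-write memory over part of the reservation.  MAP_FIXED
 * replaces the existing no-access mapping at exactly this address.
 * (Hypothetical helper, not taken from big_allocate.) */
void *commit_range(void *addr, size_t len)
{
    void *p = mmap(addr, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

/* Returns 0 if the reserve/commit/use cycle worked, -1 otherwise. */
int reserve_then_commit_demo(void)
{
    const size_t region = (size_t)64 << 20;  /* 64 MiB of address space */
    const size_t chunk  = (size_t)2 << 20;   /* one 2 MiB "large object" */

    char *base = reserve_region(region);
    if (!base)
        return -1;

    /* Allocate the second 2 MiB slot inside the reserved region. */
    char *obj = commit_range(base + chunk, chunk);
    if (obj != base + chunk) {
        munmap(base, region);
        return -1;
    }

    /* The committed range is now ordinary writable memory. */
    memset(obj, 0xab, chunk);
    int ok = (unsigned char)obj[chunk - 1] == 0xab;

    munmap(base, region);
    return ok ? 0 : -1;
}
```

The second mmap() call is the "mmap() over an existing no-access
mapping" case: the address is chosen by the allocator, not by the
kernel, which is what allows a simple 1:1 lookup from an address to
per-chunk metadata.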
At least, that is what they rely on under the special set of conditions
we care about.

I have some plans to rewrite big_allocate with a different design.  But
for now we have existing code that may make your life harder than you
wished for.

Jörn

-- 
Without congressional action or a strong judicial precedent, I would
_strongly_ recommend against anyone trusting their private data to a
company with physical ties to the United States.
-- Ladar Levison