> With enough pages at once, though, munmap would be fine, too.

That implies lots of page faults and zeroing though. The zeroing alone
is a major performance issue.

There are separate issues with munmap since it ends up resulting in a
lot more virtual memory fragmentation. It would help if the kernel used
first-best-fit for mmap instead of the current naive algorithm (bonus:
O(log n) worst-case, not O(n)). Since allocators like jemalloc and
PartitionAlloc want 2M aligned spans, mixing them with other allocators
can also accelerate the VM fragmentation caused by the dumb mmap
algorithm (i.e. they make a 2M aligned mapping, some other mmap user
does 4k, now there's a nearly 2M gap when the next 2M region is made and
the kernel keeps going rather than reusing it). Anyway, that's a totally
separate issue from this. Just felt like complaining :).

> Maybe what's really needed is a MADV_FREE variant that takes an iovec.
> On an all-cores multithreaded mm, the TLB shootdown broadcast takes
> thousands of cycles on each core more or less regardless of how much
> of the TLB gets zapped.

That would work very well. The allocator ends up having a sequence of
dirty spans that it needs to purge in one go. As long as purging is
fairly spread out, the cost of a single TLB shootdown isn't that bad. It
is extremely bad if it needs to do it over and over to purge a bunch of
ranges, which can happen if the memory has ended up being very, very
fragmented despite the efforts to compact it (depends on what the
application ends up doing).
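
To make the purging pattern concrete, here is a rough userspace sketch of what an allocator can do today: sort its dirty spans, merge the ones that touch, and issue one madvise call per merged range. The names `dirty_span`, `coalesce_spans` and `purge_spans` are made up for illustration and are not taken from jemalloc, PartitionAlloc or the kernel. The sketch assumes spans are page-aligned and that the caller passes an advice value its kernel supports (e.g. MADV_DONTNEED, or MADV_FREE where available).

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/mman.h>

/* Hypothetical record for one dirty span: page-aligned start, length in bytes. */
struct dirty_span {
    uintptr_t start;
    size_t len;
};

/* qsort comparator: order spans by start address. */
static int span_cmp(const void *a, const void *b)
{
    const struct dirty_span *x = a, *y = b;
    return (x->start > y->start) - (x->start < y->start);
}

/*
 * Sort spans by address and merge any that touch or overlap, in place.
 * Returns the number of merged ranges left at the front of the array.
 */
size_t coalesce_spans(struct dirty_span *spans, size_t n)
{
    assert(spans != NULL || n == 0);
    if (n == 0)
        return 0;

    qsort(spans, n, sizeof(*spans), span_cmp);

    size_t out = 0;
    for (size_t i = 1; i < n; i++) {
        uintptr_t end = spans[out].start + spans[out].len;
        if (spans[i].start <= end) {
            /* Adjacent or overlapping: extend the current range if needed. */
            uintptr_t new_end = spans[i].start + spans[i].len;
            if (new_end > end)
                spans[out].len = new_end - spans[out].start;
        } else {
            /* Gap between ranges: start a new merged range. */
            spans[++out] = spans[i];
        }
    }
    return out + 1;
}

/*
 * Purge all dirty spans with one madvise call per merged range.
 * Returns the number of madvise calls made, or -1 if a call failed.
 */
long purge_spans(struct dirty_span *spans, size_t n, int advice)
{
    size_t merged = coalesce_spans(spans, n);
    for (size_t i = 0; i < merged; i++) {
        if (madvise((void *)spans[i].start, spans[i].len, advice) != 0)
            return -1;
    }
    return (long)merged;
}
```

Merging only helps when dirty spans happen to be adjacent. In the badly fragmented case every disjoint range still costs its own madvise call, which is exactly the situation where a single call taking an iovec of ranges would pay off.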