>> I do wonder which purpose the deferred split serves nowadays at all.
>> Fortunately, there is documentation: Documentation/vm/transhuge.rst:
>>
>> "
>> Unmapping part of THP (with munmap() or other way) is not going to free
>> memory immediately. Instead, we detect that a subpage of THP is not in
>> use in page_remove_rmap() and queue the THP for splitting if memory
>> pressure comes. Splitting will free up unused subpages.
>>
>> Splitting the page right away is not an option due to locking context in
>> the place where we can detect partial unmap. It also might be
>> counterproductive since in many cases partial unmap happens during
>> exit(2) if a THP crosses a VMA boundary.
>>
>> The function deferred_split_huge_page() is used to queue a page for
>> splitting. The splitting itself will happen when we get memory pressure
>> via shrinker interface.
>> "
>>
>> I do wonder which these locking contexts are exactly, and if we could
>> also do the same thing on ordinary munmap -- because I assume it can be
>> similarly problematic for some applications.
>
> This is a good question regarding munmap. One main difference is
> munmap takes mmap_lock in write mode and usually performance critical
> applications avoid such operations.

Maybe we can extend it to most page zapping, if that makes things simpler.

>
>> The "exit()" case might
>> indeed be interesting, but I really do wonder if this is even observable
>> in actual number: I'm not so sure about the "many cases" but I might be
>> wrong, of course.
>
> I am not worried about the exit(). The whole THP will get freed and be
> removed from the deferred list as well. Note that deferred list does
> not hold reference to the THP and has a hook in the THP destructor.

Yes, you're right. We'll run into the destructor either way.

--
Thanks,

David / dhildenb
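
For illustration only, here is a toy user-space model of the lifecycle discussed above: a partial unmap queues the THP, memory pressure splits whatever is still queued, and freeing the whole THP takes it off the deferred list through a destructor hook. This is a rough sketch, not kernel code. The names HugePage, DeferredSplitQueue, unmap() and shrink() are made up for this example, and the 512-subpage default assumes a 2 MiB THP with 4 KiB base pages.

```python
# Toy model only: class and method names are hypothetical, not kernel API.

class HugePage:
    """Stand-in for a THP: tracks how many of its subpages are still mapped."""

    def __init__(self, nr_subpages=512):
        self.nr_subpages = nr_subpages
        self.mapped = nr_subpages
        self.queued = False  # currently on the deferred list?
        self.freed = False   # whole page freed through the destructor?


class DeferredSplitQueue:
    """Remembers partially unmapped pages until memory pressure arrives."""

    def __init__(self):
        self.pending = []

    def unmap(self, page, n):
        """Unmap up to n subpages; nothing is split or reclaimed here."""
        page.mapped -= min(n, page.mapped)
        if page.mapped == 0:
            # Fully unmapped (e.g. at exit): the whole page is freed.
            self._destroy(page)
        elif not page.queued:
            # Partial unmap detected: only remember the page for later.
            page.queued = True
            self.pending.append(page)

    def _destroy(self, page):
        """Destructor hook: a dying page removes itself from the list."""
        if page.queued:
            self.pending.remove(page)
            page.queued = False
        page.freed = True

    def shrink(self):
        """Memory pressure: split queued pages, return subpages reclaimed."""
        reclaimed = 0
        for page in self.pending:
            reclaimed += page.nr_subpages - page.mapped
            page.queued = False
        self.pending.clear()
        return reclaimed


queue = DeferredSplitQueue()
page = HugePage()
queue.unmap(page, 100)  # partial unmap: queued, nothing reclaimed yet
queue.unmap(page, 412)  # remaining subpages go away, as they would at exit
print(len(queue.pending), queue.shrink())  # prints "0 0"
```

In this model the exit() case leaves nothing for the shrinker to do, which is the point made above: the page is gone from the deferred list by the time it is freed.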