On 31.08.24 11:23, Barry Song wrote:
From: Barry Song <v-songbaohua@xxxxxxxx>

On a physical phone, it's sometimes observed that deferred_split mTHPs account for over 15% of the total mTHPs. Profiling by Chuanhua indicates that the majority of these originate from the typical fork scenario. When the child process either execs or exits, the parent process should ideally be able to reuse the entire mTHP. However, the current kernel lacks this capability and instead places the mTHP into split_deferred, performing a CoW (Copy-on-Write) on just a single subpage of the mTHP.

 main()
 {
 #define SIZE 1024 * 1024UL
         void *p = malloc(SIZE);
         memset(p, 0x11, SIZE);
         if (fork() == 0)
                 exec(....);
         /*
          * this will trigger cow one subpage from
          * mTHP and put mTHP into split_deferred
          * list
          */
         *(int *)(p + 10) = 10;
         printf("done\n");
         while(1);
 }

This leads to two significant issues:

* Memory Waste: Before the mTHP is fully split by the shrinker, it wastes memory. In extreme cases, such as with a 64KB mTHP, the memory usage could be 64KB + 60KB until the last subpage is written, at which point the mTHP is freed.

* Fragmentation and Performance Loss: It destroys large folios (negating the performance benefits of CONT-PTE) and fragments memory.

To address this, we should aim to reuse the entire mTHP in such cases.

Hi David,

I’ve renamed wp_page_reuse() to wp_folio_reuse() and added an entirely_reuse argument because I’m not sure if there are still cases where we reuse a subpage within an mTHP. For now, I’m setting entirely_reuse to true only for the newly supported case, while all other cases still get false. Please let me know if this is incorrect; if we don’t reuse subpages at all, we could remove the argument.
See [1], which I sent out this week; it is able to reuse even without scanning page tables. If we find that the folio is exclusive, we could try processing the surrounding PTEs that map the same folio.
[1] https://lkml.kernel.org/r/20240829165627.2256514-1-david@xxxxxxxxxx

--
Cheers,

David / dhildenb