On Fri, Mar 8, 2024 at 10:03 PM David Hildenbrand <david@xxxxxxxxxx> wrote:
>
> On 08.03.24 09:56, Barry Song wrote:
> > From: Barry Song <v-songbaohua@xxxxxxxx>
> >
> > In a Copy-on-Write (CoW) scenario, the last subpage will reuse the entire
> > large folio, resulting in the waste of (nr_pages - 1) pages. This wasted
> > memory remains allocated until it is either unmapped or memory
> > reclamation occurs.
> >
> > The following small program can serve as evidence of this behavior
> >
> >  main()
> >  {
> >  #define SIZE 1024 * 1024 * 1024UL
> >          void *p = malloc(SIZE);
> >          memset(p, 0x11, SIZE);
> >          if (fork() == 0)
> >                  _exit(0);
> >          memset(p, 0x12, SIZE);
> >          printf("done\n");
> >          while(1);
> >  }
> >
> > For example, using a 1024KiB mTHP by:
> >  echo always > /sys/kernel/mm/transparent_hugepage/hugepages-1024kB/enabled
> >
> > (1) w/o the patch, it takes 2GiB,
> >
> > Before running the test program,
> >  / # free -m
> >                 total        used        free      shared  buff/cache   available
> >  Mem:            5754          84        5692           0          17        5669
> >  Swap:              0           0           0
> >
> >  / # /a.out &
> >  / # done
> >
> > After running the test program,
> >  / # free -m
> >                 total        used        free      shared  buff/cache   available
> >  Mem:            5754        2149        3627           0          19        3605
> >  Swap:              0           0           0
> >
> > (2) w/ the patch, it takes 1GiB only,
> >
> > Before running the test program,
> >  / # free -m
> >                 total        used        free      shared  buff/cache   available
> >  Mem:            5754          89        5687           0          17        5664
> >  Swap:              0           0           0
> >
> >  / # /a.out &
> >  / # done
> >
> > After running the test program,
> >  / # free -m
> >                 total        used        free      shared  buff/cache   available
> >  Mem:            5754        1122        4655           0          17        4632
> >  Swap:              0           0           0
> >
> > This patch migrates the last subpage to a small folio and immediately
> > returns the large folio to the system. It benefits both memory availability
> > and anti-fragmentation.
> >
> > Cc: David Hildenbrand <david@xxxxxxxxxx>
> > Cc: Ryan Roberts <ryan.roberts@xxxxxxx>
> > Cc: Lance Yang <ioworker0@xxxxxxxxx>
> > Signed-off-by: Barry Song <v-songbaohua@xxxxxxxx>
> > ---
> >  mm/memory.c | 8 ++++++++
> >  1 file changed, 8 insertions(+)
> >
> > diff --git a/mm/memory.c b/mm/memory.c
> > index e17669d4f72f..0200bfc15f94 100644
> > --- a/mm/memory.c
> > +++ b/mm/memory.c
> > @@ -3523,6 +3523,14 @@ static bool wp_can_reuse_anon_folio(struct folio *folio,
> >                  folio_unlock(folio);
> >                  return false;
> >          }
> > +        /*
> > +         * If the last subpage reuses the entire large folio, it would
> > +         * result in a waste of (nr_pages - 1) pages
> > +         */
> > +        if (folio_ref_count(folio) == 1 && folio_test_large(folio)) {
> > +                folio_unlock(folio);
> > +                return false;
> > +        }
> >          /*
> >           * Ok, we've got the only folio reference from our mapping
> >           * and the folio is locked, it's dark out, and we're wearing
>
>
> Why not simply:
>
> diff --git a/mm/memory.c b/mm/memory.c
> index e17669d4f72f7..46d286bd450c6 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -3498,6 +3498,10 @@ static vm_fault_t wp_page_shared(struct vm_fault
> *vmf, struct folio *folio)
>   static bool wp_can_reuse_anon_folio(struct folio *folio,
>                                       struct vm_area_struct *vma)
>   {
> +
> +       if (folio_test_large(folio))
> +               return false;
> +
>          /*
>           * We have to verify under folio lock: these early checks are
>           * just an optimization to avoid locking the folio and freeing
>
> We could only possibly succeed if we are the last one mapping a PTE
> either way. No we simply give up right away for the time being.

nice !

>
> --
> Cheers,
>
> David / dhildenb
>