Re: [PATCH v3 4/4] mm: Avoid splitting pmd for lazyfree pmd-mapped THP in try_to_unmap

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 





On 2025/1/15 11:38, Barry Song wrote:
From: Barry Song <v-songbaohua@xxxxxxxx>

The try_to_unmap_one() function currently handles PMD-mapped THPs
inefficiently. It first splits the PMD into PTEs, copies the dirty
state from the PMD to the PTEs, iterates over the PTEs to locate
the dirty state, and then marks the THP as swap-backed. This process
involves unnecessary PMD splitting and redundant iteration. Instead,
this functionality can be efficiently managed in
__discard_anon_folio_pmd_locked(), avoiding the extra steps and
improving performance.

The following microbenchmark redirties folios after invoking MADV_FREE,
then measures the time taken to perform memory reclamation (actually
set those folios swapbacked again) on the redirtied folios.

  #include <stdio.h>
  #include <sys/mman.h>
  #include <string.h>
  #include <time.h>

  #define SIZE 128*1024*1024  // 128 MB

  int main(int argc, char *argv[])
  {
  	while(1) {
  		volatile int *p = mmap(0, SIZE, PROT_READ | PROT_WRITE,
  				MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

  		memset((void *)p, 1, SIZE);
  		madvise((void *)p, SIZE, MADV_FREE);
  		/* redirty after MADV_FREE */
  		memset((void *)p, 1, SIZE);

		clock_t start_time = clock();
  		madvise((void *)p, SIZE, MADV_PAGEOUT);
  		clock_t end_time = clock();

  		double elapsed_time = (double)(end_time - start_time) / CLOCKS_PER_SEC;
  		printf("Time taken by reclamation: %f seconds\n", elapsed_time);

  		munmap((void *)p, SIZE);
  	}
  	return 0;
  }

Testing results are as below,
w/o patch:
~ # ./a.out
Time taken by reclamation: 0.007300 seconds
Time taken by reclamation: 0.007226 seconds
Time taken by reclamation: 0.007295 seconds
Time taken by reclamation: 0.007731 seconds
Time taken by reclamation: 0.007134 seconds
Time taken by reclamation: 0.007285 seconds
Time taken by reclamation: 0.007720 seconds
Time taken by reclamation: 0.007128 seconds
Time taken by reclamation: 0.007710 seconds
Time taken by reclamation: 0.007712 seconds
Time taken by reclamation: 0.007236 seconds
Time taken by reclamation: 0.007690 seconds
Time taken by reclamation: 0.007174 seconds
Time taken by reclamation: 0.007670 seconds
Time taken by reclamation: 0.007169 seconds
Time taken by reclamation: 0.007305 seconds
Time taken by reclamation: 0.007432 seconds
Time taken by reclamation: 0.007158 seconds
Time taken by reclamation: 0.007133 seconds
…

w/ patch

~ # ./a.out
Time taken by reclamation: 0.002124 seconds
Time taken by reclamation: 0.002116 seconds
Time taken by reclamation: 0.002150 seconds
Time taken by reclamation: 0.002261 seconds
Time taken by reclamation: 0.002137 seconds
Time taken by reclamation: 0.002173 seconds
Time taken by reclamation: 0.002063 seconds
Time taken by reclamation: 0.002088 seconds
Time taken by reclamation: 0.002169 seconds
Time taken by reclamation: 0.002124 seconds
Time taken by reclamation: 0.002111 seconds
Time taken by reclamation: 0.002224 seconds
Time taken by reclamation: 0.002297 seconds
Time taken by reclamation: 0.002260 seconds
Time taken by reclamation: 0.002246 seconds
Time taken by reclamation: 0.002272 seconds
Time taken by reclamation: 0.002277 seconds
Time taken by reclamation: 0.002462 seconds

Good result.

This patch significantly speeds up try_to_unmap_one() by allowing it
to skip redirtied THPs without splitting the PMD.

Suggested-by: Baolin Wang <baolin.wang@xxxxxxxxxxxxxxxxx>
Suggested-by: Lance Yang <ioworker0@xxxxxxxxx>
Signed-off-by: Barry Song <v-songbaohua@xxxxxxxx>

LGTM.
Reviewed-by: Baolin Wang <baolin.wang@xxxxxxxxxxxxxxxxx>




[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]

  Powered by Linux