On Mon, Feb 26, 2024 at 01:57:39PM +1300, Barry Song wrote:
> From: Barry Song <v-songbaohua@xxxxxxxx>
>
> While doing MADV_PAGEOUT, the current code will clear PTE young
> so that vmscan won't read young flags to allow the reclamation
> of madvised folios to go ahead.
> It seems we can do it by directly ignoring references, thus we
> can remove tlb flush in madvise and rmap overhead in vmscan.
>
> Regarding the side effect, in the original code, if a parallel
> thread runs side by side to access the madvised memory with the
> thread doing madvise, folios will get a chance to be re-activated
> by vmscan (though the time gap is actually quite small since
> checking PTEs is done immediately after clearing PTEs young). But
> with this patch, they will still be reclaimed. But this behaviour
> doing PAGEOUT and doing access at the same time is quite silly
> like DoS. So probably, we don't need to care. Or ignoring the
> new access during the quite small time gap is even better.
>
> For DAMON's DAMOS_PAGEOUT based on physical address region, we
> still keep its behaviour as is since a physical address might
> be mapped by multiple processes. MADV_PAGEOUT based on virtual
> address is actually much more aggressive on reclamation. To
> untouch paddr's DAMOS_PAGEOUT, we simply pass ignore_references
> as false in reclaim_pages().
>
> A microbench as below has shown 6% decrement on the latency of
> MADV_PAGEOUT,
>
> #define PGSIZE 4096
> main()
> {
>         int i;
> #define SIZE 512*1024*1024
>         volatile long *p = mmap(NULL, SIZE, PROT_READ | PROT_WRITE,
>                         MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
>
>         for (i = 0; i < SIZE/sizeof(long); i += PGSIZE / sizeof(long))
>                 p[i] = 0x11;
>
>         madvise(p, SIZE, MADV_PAGEOUT);
> }
>
> w/o patch                    w/ patch
> root@10:~# time ./a.out      root@10:~# time ./a.out
> real    0m49.634s            real    0m46.334s
> user    0m0.637s             user    0m0.648s
> sys     0m47.434s            sys     0m44.265s
>
> Cc: SeongJae Park <sj@xxxxxxxxxx>
> Cc: Minchan Kim <minchan@xxxxxxxxxx>
> Cc: Michal Hocko <mhocko@xxxxxxxx>
> Cc: Johannes Weiner <hannes@xxxxxxxxxxx>
> Signed-off-by: Barry Song <v-songbaohua@xxxxxxxx>

Acked-by: Minchan Kim <minchan@xxxxxxxxxx>
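
For anyone who wants to rerun the measurement, here is a rough standalone sketch of the quoted microbench. It is not part of the patch. It only adds the headers and error checks that the quoted snippet leaves out. The helper names touch_pages() and pageout_bench() are made up for illustration, and the fallback definition of MADV_PAGEOUT assumes the Linux uapi value for libc headers that predate 5.4.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

#ifndef MADV_PAGEOUT
#define MADV_PAGEOUT 21	/* Linux uapi value, available since 5.4 */
#endif

/*
 * Hypothetical helper: write one long per page so that every page in
 * the range is faulted in. Returns the number of pages touched.
 */
static size_t touch_pages(volatile long *p, size_t len, size_t page_size)
{
	size_t step = page_size / sizeof(long);
	size_t n = 0;

	for (size_t i = 0; i < len / sizeof(long); i += step) {
		p[i] = 0x11;
		n++;
	}
	return n;
}

/*
 * Hypothetical helper: map len bytes of anonymous memory, populate it,
 * then ask the kernel to reclaim it with MADV_PAGEOUT.
 * Returns 0 on success, 1 if madvise() failed, -1 if mmap() failed.
 */
static int pageout_bench(size_t len)
{
	size_t page_size = (size_t)sysconf(_SC_PAGESIZE);
	int rc = 0;
	void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
		       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

	if (p == MAP_FAILED) {
		perror("mmap");
		return -1;
	}

	touch_pages(p, len, page_size);

	/* This is the call whose latency the patch is meant to reduce. */
	if (madvise(p, len, MADV_PAGEOUT) != 0) {
		perror("madvise(MADV_PAGEOUT)");
		rc = 1;
	}

	munmap(p, len);
	return rc;
}
```

Calling pageout_bench(512UL << 20) from main() and running the binary under time(1) should approximate the setup quoted above. The system needs swap configured for anonymous pages to actually be paged out, and absolute numbers will differ by machine.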