On Wed, Mar 04, 2020 at 10:58:02AM +0100, Michal Hocko wrote:
> > >> If my understanding were correct, the newly migrated clean MADV_FREE
> > >> pages will be put at the head of inactive file LRU list instead of the
> > >> tail. So it's possible that some useful file cache pages will be
> > >> reclaimed.
> > >
> > > This is the case also when you migrate other pages, right? We simply
> > > cannot preserve the aging.
> >
> > So you consider the priority of the clean MADV_FREE pages is same as
> > that of page cache pages?
>
> This is how MADV_FREE has been implemented, yes. See f7ad2a6cb9f7 ("mm:
> move MADV_FREE pages into LRU_INACTIVE_FILE list") for the
> justification.
>
> > Because the penalty difference is so large, I
> > think it may be a good idea to always put clean MADV_FREE pages at the
> > tail of the inactive file LRU list?
>
> You are again making assumptions without giving any actual real
> examples. Reconstructing MADV_FREE pages cost can differ a lot. This
> really depends on the specific usecase. Moving pages to the tail of LRU
> would make them the primary candidate for the reclaim with a strange
> LIFO semantic. Adding them to the head might be not the universal win
> but it will at least provide a reasonable FIFO semantic. I also find
> it much more easier to reason about MADV_FREE as an inactive cache.

I tend to agree. Putting them at the tail would make MADV_FREE pages
behave more like a PageReclaim page that gets tagged for immediate
reclaim when writeback completes. Immediate reclaim is a response to
heavy memory pressure, where there is trouble finding clean file pages
to reclaim and dirty/writeback pages are getting artificially preserved
over hot-but-clean file pages. That is a clear inversion of the order in
which pages should be reclaimed, so immediate reclaim is justified
there. While there *might* be a basis for reclaiming MADV_FREE pages
sooner rather than later, there would have to be some evidence of a page
inversion problem where a known hot page was getting reclaimed before
MADV_FREE pages. For example, it could easily be considered a bug to
free MADV_FREE pages over a page that was last touched at boot time.

About the only real concern I could find about MADV_FREE is that it
keeps RSS artificially high relative to MADV_DONTNEED in the absence of
memory pressure. In some cases userspace provided a way of switching to
MADV_DONTNEED at startup time to determine if there is a memory leak or
just MADV_FREE keeping pages resident. They probably would have
benefitted from a counter noting the number of MADV_FREE pages in the
system, as opposed to the vmstat event, or some other way of
distinguishing real RSS from MADV_FREE. However, I can't find a bug
report indicating that MADV_FREE pages were pushing hot pages out to
disk (be it file-backed or anonymous).

-- 
Mel Gorman
SUSE Labs