How about blocking khugepaged from collapsing lazyfree pages?
This way, is it not better to keep the semantics of MADV_FREE?

What do you think?

Thanks,
Lance

On Fri, Feb 2, 2024 at 10:42 PM Michal Hocko <mhocko@xxxxxxxx> wrote:
>
> On Fri 02-02-24 21:46:45, Lance Yang wrote:
> > Here is a part from the man page explaining
> > the MADV_FREE semantics:
> >
> > The kernel can thus free these pages, but the
> > freeing could be delayed until memory pressure
> > occurs. For each of the pages that has been
> > marked to be freed but has not yet been freed,
> > the free operation will be canceled if the caller
> > writes into the page. If there is no subsequent
> > write, the kernel can free the pages at any time.
> >
> > IIUC, if there is no subsequent write, lazyfree
> > pages will eventually be reclaimed.
>
> If there is no memory pressure then this might not
> ever happen. User cannot make any assumption about
> their content once madvise call has been done. The
> content has to be considered lost. Sure the userspace
> might have means to tell those pages from zero pages
> and recheck after the write but that is about it.
>
> > khugepaged
> > treats lazyfree pages the same as pte_none,
> > avoiding copying them to the new huge page
> > during collapse. It seems that lazyfree pages
> > are reclaimed before khugepaged collapses them.
> > This aligns with user expectations.
> >
> > However, IMO, if the content of MADV_FREE pages
> > remains valid during collapse, then khugepaged
> > treating lazyfree pages the same as pte_none
> > might not be suitable.
>
> Why?
>
> Unless I am missing something (which is possible of
> course) I do not really see why dropping the content
> of those pages and replacing them with a THP is any
> difference from reclaiming those pages and then faulting
> in a non-THP zero page.
>
> Now, if khugepaged reused the original content of MADV_FREE
> pages that would be a slightly different story. I can
> see why users would expect zero pages to back madvised
> area.
> --
> Michal Hocko
> SUSE Labs
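
For reference, the MADV_FREE semantics quoted above can be exercised from
userspace with something like the following. This is only a minimal sketch
and not code from this thread: lazyfree_demo() is a hypothetical helper name,
and it assumes Linux 4.5 or later with MADV_FREE exposed by <sys/mman.h>.
It checks just the two things the man page text allows userspace to rely on:
a read after madvise() sees either the old content or zeroes, and a
subsequent write is preserved.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

/*
 * Hypothetical helper (not from the thread).
 * Returns  0 if the observed behaviour matches the man page semantics,
 *         -1 if mmap() or madvise(MADV_FREE) is unavailable or fails,
 *          1 if something unexpected was observed.
 */
int lazyfree_demo(size_t len)
{
    assert(len > 0);

    unsigned char *buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
                              MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (buf == MAP_FAILED)
        return -1;

    /* Dirty the pages so that they are really backed by memory. */
    memset(buf, 0xAA, len);

    /* Mark the range lazyfree; its content must now be considered lost. */
    if (madvise(buf, len, MADV_FREE) != 0) {
        munmap(buf, len);
        return -1;
    }

    /* volatile so the compiler cannot cache or elide the accesses. */
    volatile unsigned char *p = buf;

    /* A read may see the old data or a zero page, nothing else. */
    unsigned char seen = p[0];
    int ok = (seen == 0xAA || seen == 0x00);

    /* A write cancels the pending free for that page, so it must stick. */
    p[0] = 0x55;
    ok = ok && (p[0] == 0x55);

    munmap(buf, len);
    return ok ? 0 : 1;
}
```

Calling lazyfree_demo(4096) on an idle machine will usually see the old
0xAA content, since nothing gets reclaimed without memory pressure. That is
why the content cannot be relied on either way once madvise() has returned.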