On Tue, Aug 27, 2024 at 04:07:57AM +0000, gaoxu wrote:
> >
> > -----Original Message-----
> > From: Lokesh Gidra <lokeshgidra@xxxxxxxxxx>
> > Sent: August 27, 2024 8:12
> > To: Barry Song <21cnbao@xxxxxxxxx>
> > Cc: Suren Baghdasaryan <surenb@xxxxxxxxxx>; Nicolas Geoffray
> > <ngeoffray@xxxxxxxxxx>; Michal Hocko <mhocko@xxxxxxxx>; gaoxu
> > <gaoxu2@xxxxxxxxx>; Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>;
> > linux-mm@xxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx; Shaohua Li <shli@xxxxxx>;
> > yipengxiang <yipengxiang@xxxxxxxxx>; fengbaopeng
> > <fengbaopeng@xxxxxxxxx>; Kalesh Singh <kaleshsingh@xxxxxxxxxx>
> > Subject: Re: [PATCH v2] mm: add lazyfree folio to lru tail
> >
> > On Mon, Aug 26, 2024 at 12:55 PM Barry Song <21cnbao@xxxxxxxxx> wrote:
> > >
> > > On Tue, Aug 27, 2024 at 4:37 AM Lokesh Gidra <lokeshgidra@xxxxxxxxxx> wrote:
> > > >
> > > > Thanks Suren for looping in
> > > >
> > > > On Fri, Aug 23, 2024 at 4:39 PM Suren Baghdasaryan <surenb@xxxxxxxxxx> wrote:
> > > > >
> > > > > On Wed, Aug 21, 2024 at 2:47 PM Barry Song <21cnbao@xxxxxxxxx> wrote:
> > > > > >
> > > > > > On Wed, Aug 21, 2024 at 8:46 PM Michal Hocko <mhocko@xxxxxxxx> wrote:
> > > > > > >
> > > > > > > On Fri 16-08-24 07:48:01, gaoxu wrote:
> > > > > > > > Replace lruvec_add_folio with lruvec_add_folio_tail in the
> > > > > > > > lru_lazyfree_fn:
> > > > > > > > 1. The lazy-free folio is added to the LRU_INACTIVE_FILE list. If it's
> > > > > > > >    moved to the LRU tail, it allows for faster release lazy-free folio
> > > > > > > >    and reduces the impact on file refault.
> > > > > > >
> > > > > > > This has been discussed when MADV_FREE was introduced. The question was
> > > > > > > whether this memory has a lower priority than other inactive memory that
> > > > > > > has been marked that way longer ago. Also consider several MADV_FREE
> > > > > > > users should they be LIFO from the reclaim POV?
> > > >
> > > > Thinking from the user's perspective, it seems to me that FIFO within
> > > > MADV_FREE'ed pages makes more sense. As a user I expect the longer a
> > > > MADV_FREE'ed page hasn't been touched, the chances are higher that it
> > > > may not be around anymore.
> > > >
> > >
> > > Hi Lokesh,
> > > Thanks!
> > >
> > > > > > The priority of this memory compared to other inactive memory that has
> > > > > > been marked for a longer time likely depends on the user's expectations -
> > > > > > How soon do users expect MADV_FREE to be reclaimed compared with old file
> > > > > > folios.
> > > > > >
> > > > > > art guys moved to MADV_FREE from MADV_DONTNEED without any
> > > > > > useful performance data and reason in the changelog:
> > > > > > https://android-review.googlesource.com/c/platform/art/+/2633132
> > > > > >
> > > > > > Since art is the Android Java heap, it can be quite large. This increases the
> > > > > > likelihood of packing the file LRU and reduces the chances of reclaiming
> > > > > > anonymous memory, which could result in more file re-faults while helping
> > > > > > anonymous folio persist longer in memory.
> > > >
> > > > Individual heaps of android apps are not big, and even in there we
> > > > don't call MADV_FREE on the entire heap.
> > >
> > > How do you define "Individual heaps of android apps", do you know the usual
> > > total_size for a phone with memory pressure by running multiple apps and how
> > > much for each app?
> > >
> > Every app is a separate process and therefore has its own private ART
> > heap. Those numbers that you are asking vary drastically. But here's
> > what I can tell you:
> >
> > Max heap size for an app is 512MB typically. But it is rarely entirely
> > used. Typical heap usage is 50MB to 250MB. But as I said, not all of
> > it is MADV_FREE'ed. Only those pages which are freed after GC
> > compaction are.
> > > > > >
> > > > > > I am really curious why art guys have moved to MADV_FREE if we have
> > > > > > an approach to reach them.
> > > >
> > > > Honestly, it makes little sense as a user that calling MADV_FREE on an
> > > > anonymous mapping will impact file LRU. That was never the intention
> > > > with our ART change.
> > > >
> > >
> > > This is just how MADV_FREE is implemented in the kernel, this kind of lazyfree
> > > anon folios are moved to file but *NOT* anon LRU.
> > >
> > > > From our perspective, once a set of pages are MADV_FREE'ed, they are
> > > > like a page-cache. It gives an opportunity, without hurting memory
> > > > use, to avoid overhead of page-faults, which happen frequently after
> > > > GC is done on running apps.
> > > >
> > > > IMHO, within LRU_INACTIVE_FILE, MADV_FREE'ed pages should be
> > > > prioritized for reclamation over file ones.
> > >
> > > This is exactly what this patch is doing, putting lazyfree anon folios
> > > to the tail of file LRU so that they can be reclaimed earlier than file
> > > folios. But the question is: is the requirement "MADV_FREE'ed pages
> > > should be prioritized for reclamation over file ones" universally true for
> > > all other non-Android users?
> > >
> > That's definitely an important question to get answered. But putting
> > my users hat on again, by explicitly MADV_FREE'ing we ask for that
> > behavior. IMHO, MADV_FREE'ed pages should be the first ones to be
> > reclaimed on memory pressure.
> For non-Android systems, perhaps the author of MADV_FREE can provide a more
> reasonable opinion;
>
> Add Minchan Kim.
> Please forgive me for forgetting to add you when sending the patch.

AFAIR, there were two concerns:

1. The file LRU would contain pages used only once. While MADV_FREE allows
   discarding pages under memory pressure, the system would still have
   non-working set pages within the file LRU (e.g., those used only once).

2. LRU inversion among MADV_FREE users. Consider this time order:

   1. A process: MADV_FREE
   2. B process: MADV_FREE
   3. C process: MADV_FREE

   The moving tail approach would discard the most recent pages from
   Process C first, instead of those from Process A.

Of course, this isn't universally true for all workloads, but it's the reality.

At the time, I proposed introducing an additional "ez_reclaimable" LRU list
to store MADV_FREE pages (and potentially other hinted pages in the future).
This would allow differentiating priority among LRU lists based on knobs or
heuristics. However, this idea wasn't well-received.
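
To make the ordering in concern 2 concrete, here is a toy model, not kernel code. It assumes the inactive file LRU is a simple list that reclaim drains strictly from the tail, and it ignores batching, reference bits, and everything else real reclaim does. The function name `reclaim_order` and the folio labels are hypothetical, chosen only for illustration.

```python
from collections import deque


def reclaim_order(add_to_tail):
    """Return the order in which a toy inactive-file LRU would be reclaimed.

    Index 0 is the LRU head, index -1 is the tail. Reclaim is assumed to
    take folios from the tail only (a simplification of real reclaim).
    """
    # Two pre-existing file folios; "file-old" sits closest to the tail.
    lru = deque(["file-new", "file-old"])

    # Processes A, B, C call MADV_FREE in that time order.
    for folio in ["A", "B", "C"]:
        if add_to_tail:
            lru.append(folio)      # models lruvec_add_folio_tail()
        else:
            lru.appendleft(folio)  # models lruvec_add_folio() (head insert)

    order = []
    while lru:
        order.append(lru.pop())    # reclaim from the tail
    return order


# Head insertion: file folios go first, then A, B, C (FIFO among MADV_FREE).
print(reclaim_order(add_to_tail=False))
# Tail insertion: C goes first, then B, A, then the file folios (LIFO).
print(reclaim_order(add_to_tail=True))
```

Under these assumptions, tail insertion gets lazyfree folios reclaimed before file folios, which is what the patch wants, but it also reverses the order among MADV_FREE users, which is the inversion described above.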