On Wed, Jan 05, 2022 at 08:55:34AM +0000, SeongJae Park wrote:
> Hi Yu,
>
> On Tue, 4 Jan 2022 13:22:19 -0700 Yu Zhao <yuzhao@xxxxxxxxxx> wrote:
>
> > TLDR
> > ====
> > The current page reclaim is too expensive in terms of CPU usage and it
> > often makes poor choices about what to evict. This patchset offers an
> > alternative solution that is performant, versatile and
> > straightforward.
> >
> [...]
> > Summery
> > =======
> > The facts are:
> > 1. The independent lab results and the real-world applications
> >    indicate substantial improvements; there are no known regressions.
>
> So impressive results!
>
> > 2. Thrashing prevention, working set estimation and proactive reclaim
> >    work out of the box; there are no equivalent solutions.
>
> I think similar works are already available out of the box with the latest
> mainline tree, though it might be suboptimal in some cases.

Ok, I will sound harsh because I hate it when people challenge facts
while having no idea what they are talking about.

Our job is to help the leadership make the best decisions by providing
them with facts, not feeding them crap.

Don't get me wrong -- you are welcome to start another thread and
have a casual discussion with me. But this thread is not for that;
it's for the leadership and stakeholders to make a decision. Check who
is in "To" and "Cc" and what my request is.

> I didn't read this patchset thoroughly yet, so I might missing many things. If
> so, please feel free to let me know.

Yes, apparently you didn't read this patchset thoroughly, and you have
missed all the things that matter to this thread.

> First, you can do thrashing prevention using DAMON-based Operation Scheme
> (DAMOS)[1] with MADV_COLD action.

Here is what thrashing prevention really means, from patch 8:

+Personal computers
+------------------
+:Thrashing prevention: Write ``N`` to
+ ``/sys/kernel/mm/lru_gen/min_ttl_ms`` to prevent the working set of
+ ``N`` milliseconds from getting evicted. The OOM killer is invoked if
+ this working set can't be kept in memory. Based on the average human
+ detectable lag (~100ms), ``N=1000`` usually eliminates intolerable
+ lags due to thrashing. Larger values like ``N=3000`` make lags less
+ noticeable at the cost of more OOM kills.

It's about when to trigger OOM kills. Got it? Or probably you don't
understand what MADV_COLD is either?

> Second, for working set estimation, you can either use the DAMOS
> again with statistics action, or the damon_aggregated tracepoint[2].

This is what you are suggesting:

TRACE_EVENT(damon_aggregated,
	TP_printk("target_id=%lu nr_regions=%u %lu-%lu: %u",
		__entry->target_id, __entry->nr_regions,
		__entry->start, __entry->end, __entry->nr_accesses)

Now read my doc again:

+Data centers
+------------
+:Debugfs interface: ``/sys/kernel/debug/lru_gen`` has the following
+ format:
+   memcg  memcg_id  memcg_path
+     node  node_id

Have you heard of something called memcg? And NUMA nodes? How exactly
can this tracepoint provide information about different memcgs and
NUMA nodes?

> The DAMON user space tool[3] helps the tracepoint analysis and
> visualization.

What does "work out of the box" mean? Should every Linux desktop,
laptop and phone user install this tool?

> Finally, for the proactive reclaim, you can again use the DAMOS
> with MADV_PAGEOUT action

How exactly does MADV_PAGEOUT find pages that are NOT mapped in page
tables? Let me tell you another fact: they are usually the cheapest to
reclaim.

> or simply the DAMON-based proactive reclaim module (DAMON_RECLAIM)[4].

> [4] https://docs.kernel.org/admin-guide/mm/damon/reclaim.html

How many knobs does DAMON_RECLAIM have? 14? I lost count.

> Of course, the integration might not be so simple as seems to me now.

Look, I'm open to your suggestion. I probably should have been nicer.
So I'm sorry. I just don't appreciate alternative facts.
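
For readers who want to try the two interfaces quoted from patch 8 above, here is a minimal user-space sketch. It is not part of the patchset. The helper names (`set_min_ttl_ms`, `parse_lru_gen_headers`) are hypothetical, and the parser assumes that header lines in ``/sys/kernel/debug/lru_gen`` begin with the literal words `memcg` and `node`, as the quoted format suggests. Every other line is skipped, since its layout is not shown in this message.

```python
from pathlib import Path

# Path quoted from patch 8 in the message above.
MIN_TTL_PATH = "/sys/kernel/mm/lru_gen/min_ttl_ms"


def set_min_ttl_ms(n_ms, path=MIN_TTL_PATH):
    """Write N (in milliseconds) to the min_ttl_ms knob.

    Hypothetical helper: on a real system this needs root and a kernel
    that carries the patchset.
    """
    if not isinstance(n_ms, int) or n_ms < 0:
        raise ValueError("N must be a non-negative integer (milliseconds)")
    Path(path).write_text(f"{n_ms}\n")


def parse_lru_gen_headers(text):
    """Group node ids under each memcg from lru_gen debugfs text.

    Assumption: header lines start with the keywords 'memcg' and 'node'.
    Lines of any other shape are ignored rather than interpreted.
    """
    memcgs = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "memcg" and len(fields) >= 2:
            memcgs.append({
                "id": int(fields[1]),
                "path": fields[2] if len(fields) > 2 else "",
                "nodes": [],
            })
        elif fields[0] == "node" and len(fields) >= 2 and memcgs:
            memcgs[-1]["nodes"].append(int(fields[1]))
    return memcgs


if __name__ == "__main__":
    # Hypothetical sample input, not output captured from a real kernel.
    sample = "memcg 1 /\n node 0\n node 1\n"
    for memcg in parse_lru_gen_headers(sample):
        print(memcg["id"], memcg["path"], memcg["nodes"])
```

Calling `set_min_ttl_ms(1000)` would correspond to the ``N=1000`` setting described in the quoted doc. The parser only shows that the debugfs output is keyed by memcg and NUMA node, which is the point made in the reply above.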