On Fri, Jan 12, 2024 at 09:49:08AM +0100, Michal Hocko wrote:
> On Thu 11-01-24 16:08:57, Jianfeng Wang wrote:
> >
> >
> > On 1/11/24 1:54 PM, Andrew Morton wrote:
> > > On Thu, 11 Jan 2024 10:54:45 -0800 Jianfeng Wang <jianfeng.w.wang@xxxxxxxxxx> wrote:
> > >
> > >>
> > >>> Unless you can show any actual runtime effect of this patch then I think
> > >>> it shouldn't be merged.
> > >>>
> > >>
> > >> Thanks for raising your concern.
> > >> I'd call it a trade-off rather than "not really correct". Look at
> > >> unmap_region() / free_pages_and_swap_cache() written by Linus. These are in
> > >> favor of this pattern, which indicates that the trade-off (i.e. draining
> > >> local CPU or draining all CPUs or no draining at all) had been made in the
> > >> same way in the past. I don't have a specific runtime effect to provide,
> > >> except that it will free 10s kB pages immediately during OOM.
>
> You are missing an important point. Those two calls are quite different.
> oom_reaper unmaps memory after all the reclaim attempts have failed.
> That includes draining all sorts of caches on the way. Including
> draining LRU pcp cache (look for lru_add_drain_all in the reclaim path).
>
> > > I don't think it's necessary to run lru_add_drain() for each vma. Once
> > > we've done it it once, it can be skipped for additional vmas.
> > >
> > Agreed.
> >
> > > That's pretty minor because the second and successive calls will be
> > > cheap. But it becomes much more significant if we switch to
> > > lru_add_drain_all(), which sounds like what we should be doing here.
> > > Is it possible?
> > >
> > What do you both think of adding lru_add_drain_all() prior to the for loop?
>
> lru_add_drain_all relies on WQs. And we absolutely do not want to get
> oom_reaper stuck just because all the WQ is jammed. So no, this is
> actually actively harmful!

I completely agree.
__oom_reap_task_mm() is also used by process_mrelease(), which is a critical path for releasing memory on Android and is typically invoked when the system is under pressure (not only memory pressure, but CPU pressure at the same time). lru_add_drain_all() can take a long time to finish there, because Android is susceptible to priority inversion among processes. A better idea might be to let lru_add_drain_all() drain remote CPUs directly, analogous to the recent PCP modifications.
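
For reference, here is a minimal sketch of how a userspace killer might drive this path: open a pidfd, send SIGKILL through it, then call process_mrelease() to reap the victim's memory from the caller's context. The helper name kill_and_reap() is hypothetical, error handling is reduced to returning -errno, and the fallback syscall numbers are an assumption that only holds on architectures using the common syscall numbering.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <signal.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Fallback numbers: an assumption, valid only where the common table is used. */
#ifndef __NR_pidfd_open
#define __NR_pidfd_open 434
#endif
#ifndef __NR_pidfd_send_signal
#define __NR_pidfd_send_signal 424
#endif
#ifndef __NR_process_mrelease
#define __NR_process_mrelease 448
#endif

/*
 * Hypothetical helper: kill @pid and ask the kernel to release its memory
 * right away. Returns 0 on success or a negative errno value.
 */
int kill_and_reap(pid_t pid)
{
	int pidfd, ret = 0;

	pidfd = (int)syscall(__NR_pidfd_open, pid, 0);
	if (pidfd < 0)
		return -errno;

	/* process_mrelease() requires the target to have a pending SIGKILL. */
	if (syscall(__NR_pidfd_send_signal, pidfd, SIGKILL, NULL, 0) < 0) {
		ret = -errno;
		goto out;
	}

	/* Reaping runs synchronously in the caller's context. */
	if (syscall(__NR_process_mrelease, pidfd, 0) < 0)
		ret = -errno;
out:
	close(pidfd);
	return ret;
}
```

Since the reaping runs in the context of the process_mrelease() caller, any time spent waiting for a workqueue flush in this path would directly delay the memory being returned.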