On Tue, Jul 9, 2024 at 1:11 AM zhiguojiang <justinjiang@xxxxxxxx> wrote:
>
> On 2024/7/8 20:41, Barry Song wrote:
> >
> > On Tue, Jul 9, 2024 at 00:25, zhiguojiang <justinjiang@xxxxxxxx> wrote:
> > >
> > > On 2024/7/8 20:17, zhiguojiang wrote:
> > > >
> > > > On 2024/7/8 19:02, Barry Song wrote:
> > >> On Mon, Jul 8, 2024 at 9:04 PM Zhiguo Jiang <justinjiang@xxxxxxxx> wrote:
> > >>> Releasing a non-shared anonymous folio mapped solely by an exiting
> > >>> process may go through two flows: 1) the anonymous folio is first
> > >>> swapped out to swap space and turned into a swp_entry in
> > >>> shrink_folio_list(); 2) the swp_entry is then released in the
> > >>> process-exit path. This increases the CPU load of releasing such a
> > >>> folio, because it goes through a swap-out followed by the release of
> > >>> the swap space and the swp_entry.
> > >>>
> > >>> When the system is low on memory this is more likely to occur,
> > >>> because more background applications get killed.
> > >>>
> > >>> The modification is that shrink skips the non-shared anonymous folio
> > >>> mapped solely by an exiting process, so the folio is only released
> > >>> directly in the process-exit path, which saves the swap-out time and
> > >>> alleviates the load of process exiting.
> > >>>
> > >>> Signed-off-by: Zhiguo Jiang <justinjiang@xxxxxxxx>
> > >>> ---
> > >>>
> > >>> Change log:
> > >>> v4->v5:
> > >>> 1. Modify to skip non-shared anonymous folios only.
> > >>> 2. Update the comment for pra->referenced = -1.
> > >>> v3->v4:
> > >>> 1. Modify so that unshared folios mapped only in an exiting task are
> > >>>    skipped.
> > >>> v2->v3:
> > >>> Nothing.
> > >>> v1->v2:
> > >>> 1. The VM_EXITING added in v1 is removed, because it fails to compile
> > >>>    on 32-bit systems.
> > >>>
> > >>>  mm/rmap.c   | 13 +++++++++++++
> > >>>  mm/vmscan.c |  7 ++++++-
> > >>>  2 files changed, 19 insertions(+), 1 deletion(-)
> > >>>
> > >>> diff --git a/mm/rmap.c b/mm/rmap.c
> > >>> index 26806b49a86f..5b5281d71dbb
> > >>> --- a/mm/rmap.c
> > >>> +++ b/mm/rmap.c
> > >>> @@ -843,6 +843,19 @@ static bool folio_referenced_one(struct folio *folio,
> > >>>         int referenced = 0;
> > >>>         unsigned long start = address, ptes = 0;
> > >>>
> > >>> +       /*
> > >>> +        * Skip the non-shared anonymous folio mapped solely by
> > >>> +        * the single exiting process, and release it directly
> > >>> +        * in the process exiting.
> > >>> +        */
> > >>> +       if ((!atomic_read(&vma->vm_mm->mm_users) ||
> > >>> +           test_bit(MMF_OOM_SKIP, &vma->vm_mm->flags)) &&
> > >>> +           folio_test_anon(folio) && folio_test_swapbacked(folio) &&
> > >>> +           !folio_likely_mapped_shared(folio)) {
> > >>> +               pra->referenced = -1;
> > >>> +               return false;
> > >>> +       }
> > >>> +
> > >>>         while (page_vma_mapped_walk(&pvmw)) {
> > >>>                 address = pvmw.address;
> > >
> > > Sure, I agree with your modification suggestions. This way, using the
> > > PTL indeed ensures that the folio is mapped by this process.
> > > Thanks
> > >> As David suggested, what about the below?
> > >>
> > >> @@ -883,6 +870,21 @@ static bool folio_referenced_one(struct folio *folio,
> > >>                         continue;
> > >>                 }
> > >>
> > >> +               /*
> > >> +                * Skip the non-shared anonymous folio mapped solely by
> > >> +                * the single exiting process, and release it directly
> > >> +                * in the process exiting.
> > >> +                */
> > >> +               if ((!atomic_read(&vma->vm_mm->mm_users) ||
> > >> +                   test_bit(MMF_OOM_SKIP, &vma->vm_mm->flags)) &&
> > >> +                   folio_test_anon(folio) && folio_test_swapbacked(folio) &&
> > >> +                   !folio_likely_mapped_shared(folio)) {
> > >> +                       pra->referenced = -1;
> > >> +                       page_vma_mapped_walk_done(&pvmw);
> > >> +                       return false;
> > >> +               }
> > >> +
> > >>                 if (pvmw.pte) {
> > >>                         if (lru_gen_enabled() &&
> > >>                             pte_young(ptep_get(pvmw.pte))) {
> > >>
> > >> By the way, I am not convinced that using
> > >> test_bit(MMF_OOM_SKIP, &vma->vm_mm->flags) is correct (I think it is
> > >> wrong). For example, global_init can directly have it:
> > >>
> > >>                 if (is_global_init(p)) {
> > >>                         can_oom_reap = false;
> > >>                         set_bit(MMF_OOM_SKIP, &mm->flags);
> > >>                         pr_info("oom killer %d (%s) has mm pinned by %d (%s)\n",
> > >>                                 task_pid_nr(victim), victim->comm,
> > >>                                 task_pid_nr(p), p->comm);
> > >>                         continue;
> > >>                 }
> > >>
> > >> And exit_mmap() automatically sets MMF_OOM_SKIP.
> > >>
> > >> What is the purpose of this check? Is there a better way to determine
> > >> if a process is an OOM target? What about check_stable_address_space()?
> > >
> > > 1. Sorry, I overlooked the if (is_global_init(p)) case; MMF_OOM_SKIP is
> > > indeed not suitable.
> > >
> > > 2. check_stable_address_space() can indicate the oom_reaper case, but
> > > it seems unable to identify a process that is exiting normally. What
> > > about task_is_dying()?
> > >
> > > static inline bool task_is_dying(void)
> > > {
> > >         return tsk_is_oom_victim(current) || fatal_signal_pending(current) ||
> > >                 (current->flags & PF_EXITING);
> > > }
> > >
> > > Thanks
> > >
> > > We can migrate task_is_dying() from mm/memcontrol.c to
> > > include/linux/oom.h:
> > >
> > > static inline bool task_is_dying(void)
> > > {
> > >         return tsk_is_oom_victim(current) || fatal_signal_pending(current) ||
> > >                 (current->flags & PF_EXITING);
> > > }
> >
> > no. current is kswapd.
>
> Hi Barry,
>
> It seems feasible for check_stable_address_space() to replace MMF_OOM_SKIP.
> check_stable_address_space() can indicate an OOM kill, and
> !atomic_read(&vma->vm_mm->mm_users) can indicate a normal process exit.
>
>         /*
>          * Skip the non-shared anonymous folio mapped solely by
>          * the single exiting process, and release it directly
>          * in the process exiting.
>          */
>         if ((!atomic_read(&vma->vm_mm->mm_users) ||
>             check_stable_address_space(vma->vm_mm)) &&
>             folio_test_anon(folio) && folio_test_swapbacked(folio) &&
>             !folio_likely_mapped_shared(folio)) {
>                 pra->referenced = -1;
>                 page_vma_mapped_walk_done(&pvmw);
>                 return false;
>         }
>

Yes, + David, Willy
(when you send a new version, please CC people who have participated
and describe how you have addressed comments from those people.)

I also think we actually can remove "folio_test_anon(folio)". So it
could be:

@@ -883,6 +871,21 @@ static bool folio_referenced_one(struct folio *folio,
                        continue;
                }

+               /*
+                * Skip the non-shared swapbacked folio mapped solely by
+                * the exiting or OOM-reaped process. This avoids redundant
+                * swap-out followed by an immediate unmap.
+                */
+               if ((!atomic_read(&vma->vm_mm->mm_users) ||
+                   check_stable_address_space(vma->vm_mm)) &&
+                   folio_test_swapbacked(folio) &&
+                   !folio_likely_mapped_shared(folio)) {
+                       pra->referenced = -1;
+                       page_vma_mapped_walk_done(&pvmw);
+                       return false;
+               }
+
                if (pvmw.pte) {
                        if (lru_gen_enabled() &&
                            pte_young(ptep_get(pvmw.pte))) {

> Thanks
> Zhiguo
>
> > >>
> > >>> diff --git a/mm/vmscan.c b/mm/vmscan.c
> > >>> index 0761f91b407f..bae7a8bf6b3d
> > >>> --- a/mm/vmscan.c
> > >>> +++ b/mm/vmscan.c
> > >>> @@ -863,7 +863,12 @@ static enum folio_references folio_check_references(struct folio *folio,
> > >>>         if (vm_flags & VM_LOCKED)
> > >>>                 return FOLIOREF_ACTIVATE;
> > >>>
> > >>> -       /* rmap lock contention: rotate */
> > >>> +       /*
> > >>> +        * There are two cases to consider.
> > >>> +        * 1) Rmap lock contention: rotate.
> > >>> +        * 2) Skip the non-shared anonymous folio mapped solely by
> > >>> +        *    the single exiting process.
> > >>> +        */
> > >>>         if (referenced_ptes == -1)
> > >>>                 return FOLIOREF_KEEP;
> > >>>
> > >>> --
> > >>> 2.39.0
> > >>>
> > >> Thanks
> > >> Barry
> > >
> > > Thanks Barry
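A note on why the pair of checks the thread converges on identifies a dying
owner: check_stable_address_space() in include/linux/oom.h returns -EFAULT
once the OOM reaper has marked the mm with MMF_UNSTABLE, while
atomic_read(&vma->vm_mm->mm_users) reaching zero covers an ordinary
exit_mmap(). The sketch below only restates the condition from Barry's final
hunk as a standalone predicate, assuming the mm/rmap.c context with
linux/oom.h available; the helper name folio_mapped_only_by_dying_process()
is illustrative and exists neither in the kernel nor in the posted patch.

static bool folio_mapped_only_by_dying_process(struct folio *folio,
                                               struct vm_area_struct *vma)
{
        struct mm_struct *mm = vma->vm_mm;

        /*
         * "Dying" means either the last mm_users reference is gone
         * (normal exit_mmap()) or the OOM reaper has already torn the
         * address space down (MMF_UNSTABLE, which makes
         * check_stable_address_space() return -EFAULT).
         */
        if (atomic_read(&mm->mm_users) && !check_stable_address_space(mm))
                return false;

        /*
         * Only swap-backed folios that appear to be mapped by this mm
         * alone are worth skipping: reclaiming one would swap it out
         * moments before the exit path frees the swap entry again.
         */
        return folio_test_swapbacked(folio) &&
               !folio_likely_mapped_shared(folio);
}

Because the real check sits inside the page_vma_mapped_walk() loop in
folio_referenced_one(), it runs under the PTL, so the folio is known to
still be mapped by this mm when pra->referenced is set to -1;
folio_check_references() then returns FOLIOREF_KEEP and shrink_folio_list()
leaves the folio for the exit path to free.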