The releasing process of the non-shared anonymous folio mapped solely by an exiting process may go through two flows: 1) the anonymous folio is firstly is swaped-out into swapspace and transformed into a swp_entry in shrink_folio_list; 2) then the swp_entry is released in the process exiting flow. This will result in the high cpu load of releasing a non-shared anonymous folio mapped solely by an exiting process. When the low system memory and the exiting process exist at the same time, it will be likely to happen, because the non-shared anonymous folio mapped solely by an exiting process may be reclaimed by shrink_folio_list. This patch is that shrink skips the non-shared anonymous folio solely mapped by an exting process and this folio is only released directly in the process exiting flow, which will save swap-out time and alleviate the load of the process exiting. Acked-by: Barry Song <baohua@xxxxxxxxxx> Signed-off-by: Zhiguo Jiang <justinjiang@xxxxxxxx> --- Change log: v9->v10: 1.Add folio_test_anon(folio) condition. v8->v9: 1.Update Reviewed-by tag information. v7->v8: 1.Add tags of Reviewed-by and Acked-by. 2.Add #include <linux/oom.h> to solve compilation issue. v6->v7: 1.Modify tab indentation to space indentation of the continuation lines of the condition. v5->v6: 1.Move folio_likely_mapped_shared() under the PTL. 2.Add check_stable_address_space() to replace MMF_OOM_SKIP. 3.Remove folio_test_anon(folio). v4->v5: 1.Further modify to skip non-shared anonymous folio only. 2.Update comments for pra->referenced = -1. v3->v4: 1.Modify to skip only the non-shared anonymous folio mapped solely by an exiting process. v2->v3: Nothing. v1->v2: 1.The VM_EXITING added in v1 patch is removed, because it will fail to compile in 32-bit system. Comments from participants and my responses: [v9->v10]: 1.David Hildenbrand <david@xxxxxxxxxx> && Barry Song <baohua@xxxxxxxxxx> BTW, we dropped the folio_test_anon() check, but what about shmem? They also do __folio_set_swapbacked()? Even though anon_shmem behaves similarly to anonymous memory when releasing memory, it doesn't seem worth the added complexity? --> Take folio_test_anon() back. [v8->v9]: 1.Barry Song <baohua@xxxxxxxxxx> No, this is a disaster. Please ask someone for help before you send it. Neither Willy nor David has ever posted any Reviewed-by tags. Please do get someone to help you. Stop posting like this! --> Update Reviewed-by tag information in v9. [v7->v8]: 1.Barry Song <baohua@xxxxxxxxxx> You should have collected tags such as reviewed-by, acked-by you got in v6 while sending v7. --> Added in patch v8. You didn't even pass the compilation stage because you're missing 'linux/oom.h'. It's quite disappointing because I believe in your idea, but you didn't even build it before sending. --> Sorry, I overlooked the compilation of folio_likely_mapped_shared() added in patch v5. Compiled and Updated have been compeleted in patch v8. [v6->v7]: 1.Matthew Wilcox <willy@xxxxxxxxxxxxx> You told me you'd fix the indentation. You cannot indent both the continuation lines of the condition and the body of the if by one tab each! --> Modify tab indentation to space indentation of the continuation lines of the condition. [v5->v6]: 1.David Hildenbrand <david@xxxxxxxxxx> I'm currently working on moving all folio_likely_mapped_shared() under the PTL, where we are then sure that the folio is actually mapped by this process (e.g., no concurrent unmapping poisslbe). Can we do the same here directly? --> You are right. we might use page_vma_mapped_walk_done() to bail out. (Barry Song) 2.Barry Song <baohua@xxxxxxxxxx> By the way, I am not convinced that using test_bit(MMF_OOM_SKIP, &vma->vm_mm->flags) is correct (I think it is wrong). And exit_mmap() automatically has MMF_OOM_SKIP. What is the purpose of this check? Is there a better way to determine if a process is an OOM target? What about check_stable_address_space() ? --> Sorry, I overlook the situation with if (is_global_init(p)), MMF_OOM_SKIP is indeed not suitable. It seems feasible for check_stable_address_space() replacing MMF_OOM_SKIP. check_stable_address_space() can indicate oom kill, and !atomic_read(&vma->vm_mm->mm_users) can indicate the normal process exiting. I also think we actually can remove "folio_test_anon(folio)". --> Yes, update in patch v6. [v4->v5]: 1.Barry Song <baohua@xxxxxxxxxx> I don't think this is correct. folio_likely_mapped_shared() is almost "correct" but not always. Please explain why you set pra->referenced = -1. Please address all comments before you send a new version. --> Update in patch v5. 2.Matthew Wilcox <willy@xxxxxxxxxxxxx> How is the file folio similar? File folios are never written to swap, and they'll be written back from the page cache whenever the filesystem decides it's a good time to do so. --> What do you mean is that the file folio will not have any relevant identifier left in memory after it is reclamed in the shrink flow, and it will not be released again during an exiting process? If that's the case, I think we only need the anon folio is skipped here. [v3->v4]: 1.Barry Song <baohua@xxxxxxxxxx> This is clearly version 3, as you previously sent version 2, correct? --> Yes. Could you please describe the specific impact on users, including user experience and power consumption? How serious is this problem? --> At present, I do not have a suitable method to accurately measure the optimization benefit datas of this modifications, but I believe it theoretically has some benefits. Launching large memory app (for example, starting the camera) in multiple backend scenes may result in the high cpu load of the exiting processes. Applications? --> Yes, when system is low memory, it more likely to occur. I'm not completely convinced this patch is correct, but it appears to be heading in the right direction. Therefore, I expect to see new versions rather than it being dead. You changed the file mode to 755, which is incorrect. --> Solved. Why use -1? Is this meant to simulate lock contention to keep the folio without activating it? Please do have some comments to explain why. I'm not convinced this change is appropriate for shared folios. It seems more suitable for exclusive folios used solely by the exiting process. --> The skiped folios are FOLIOREF_KEEP and added into inactive lru, beacase the folios will be freed soon in the exiting process flow. Yes, the shared folios can not be simply skipped. I have made relevant modifications in patch v4 and please help to further review. https://lore.kernel.org/linux-mm/20240708031517.856-1-justinjiang@xxxxxxxx/ 2.David Hildenbrand <david@xxxxxxxxxx> but what if it is shared among multiple processes and only one of them is exiting? --> Modify to skip only the non-shared anonymous folio mapped solely by an exiting process in next version v4. [v2->v3:] Nothing. [v1->v2]: 1.Matthew Wilcox <willy@xxxxxxxxxxxxx> What testing have you done of this patch? How often does it happen? Are there particular workloads that benefit from this? (I'm not sure what "mutil backed-applications" are?) And I do mean specifically of this patch, because to my eyes it shouldn't even compile. Except on 32-bit where it'll say "warning: integer constant out of range". --> Yes, I have tested. When the low system memory and the exiting process exist at the same time, it will happen. This modification can alleviate the load of the exiting process. "mutil backed-applications" means that there are a large number of the backend applications in the system. The VM_EXITING added in v1 patch is removed, because it will fail to compile in 32-bit system. mm/rmap.c | 15 +++++++++++++++ mm/vmscan.c | 7 ++++++- 2 files changed, 21 insertions(+), 1 deletion(-) diff --git a/mm/rmap.c b/mm/rmap.c index 86787df6e212..316a6bb9747b 100644 --- a/mm/rmap.c +++ b/mm/rmap.c @@ -75,6 +75,7 @@ #include <linux/memremap.h> #include <linux/userfaultfd_k.h> #include <linux/mm_inline.h> +#include <linux/oom.h> #include <asm/tlbflush.h> @@ -870,6 +871,20 @@ static bool folio_referenced_one(struct folio *folio, continue; } + /* + * Skip the non-shared swapbacked folio mapped solely by + * the exiting or OOM-reaped process. This avoids redundant + * swap-out followed by an immediate unmap. + */ + if ((!atomic_read(&vma->vm_mm->mm_users) || + check_stable_address_space(vma->vm_mm)) && + folio_test_anon(folio) && folio_test_swapbacked(folio) && + !folio_likely_mapped_shared(folio)) { + pra->referenced = -1; + page_vma_mapped_walk_done(&pvmw); + return false; + } + if (pvmw.pte) { if (lru_gen_enabled() && pte_young(ptep_get(pvmw.pte))) { diff --git a/mm/vmscan.c b/mm/vmscan.c index 0761f91b407f..9afe4bb5ba87 100644 --- a/mm/vmscan.c +++ b/mm/vmscan.c @@ -863,7 +863,12 @@ static enum folio_references folio_check_references(struct folio *folio, if (vm_flags & VM_LOCKED) return FOLIOREF_ACTIVATE; - /* rmap lock contention: rotate */ + /* + * There are two cases to consider. + * 1) Rmap lock contention: rotate. + * 2) Skip the non-shared swapbacked folio mapped solely by + * the exiting or OOM-reaped process. + */ if (referenced_ptes == -1) return FOLIOREF_KEEP; -- 2.39.0