hi, Usama
On 2024/8/22 3:04, Usama Arif wrote:
On 20/08/2024 17:30, Barry Song wrote:
Hi Usama,
thanks! I can't judge if we need this partially_mapped flag. but if we
need, the code
looks correct to me. I'd like to leave this to David and other experts to ack.
Thanks for the reviews!
an alternative approach might be two lists? one for entirely_mapped,
the other one
for split_deferred. also seems ugly ?
That was my very first prototype! I shifted to using a bool which I sent in v1, and then a bit in _flags_1 as David suggested. I believe a bit in _flags_1 is the best way forward, as it leaves the most space in folio for future work.
On the other hand, when we want to extend your patchset to mTHP other than PMD-
order, will the only deferred_list create huge lock contention while
adding or removing
folios from it?
Yes, I would imagine so. the deferred_split_queue is per memcg/node, so that helps.
Also, this work is tied to khugepaged. So would need some thought when doing it for mTHP.
I would imagine doing underused shrinker for mTHP would be less beneficial compared to doing it for 2M THP. But probably needs experimentation.
Thanks
Below is the core code snippet to support "split underused mTHP". Can we extend the
khugepaged_max_ptes_none value to mthp and keep its semantics unchanged? With a small
modification, Only folios with page numbers greater than khugepaged_max_ptes_none - 1
can be added to the deferred_split list and can be split. What do you think?
diff --git a/mm/memory.c b/mm/memory.c
index b95fce7d190f..ef503958d6a0 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -4789,6 +4789,8 @@ static vm_fault_t do_anonymous_page(struct vm_fault *vmf)
}
folio_ref_add(folio, nr_pages - 1);
+ if (nr_pages > 1 && nr_pages > khugepaged_max_ptes_none - 1)
+ deferred_split_folio(folio, false);
add_mm_counter(vma->vm_mm, MM_ANONPAGES, nr_pages);
count_mthp_stat(folio_order(folio), MTHP_STAT_ANON_FAULT_ALLOC);
folio_add_new_anon_rmap(folio, vma, addr, RMAP_EXCLUSIVE);
shmem THP has the same memory expansion problem when the shmem_enabled configuration is
set to always. In my opinion, it is necessary to support "split underused shmem THP",
but I am not sure if there is any gap in the implementation?
Bang
Thanks