On 2024/10/28 21:46, David Hildenbrand wrote:
On 28.10.24 14:33, Kefeng Wang wrote:
On 2024/10/28 21:14, David Hildenbrand wrote:
On 28.10.24 13:52, Kefeng Wang wrote:
On 2024/10/28 18:00, David Hildenbrand wrote:
On 26.10.24 07:43, Kefeng Wang wrote:
When clearing a gigantic page, it zeroes the page from the first page to the
last page. If the addr_hint passed in is not the address of the first page
of the folio, some archs could flush the wrong cache if they do use
addr_hint as a hint. For a non-gigantic page, the base address is
calculated internally, so even with a wrong addr_hint there is only a
performance impact (process_huge_page() wants to process the target page
last to keep its cache lines hot), no functional impact. Let's pass the
real accessed address to folio_zero_user() and use the aligned address in
clear_gigantic_page() to fix it.
Fixes: 78fefd04c123 ("mm: memory: convert clear_huge_page() to folio_zero_user()")
Signed-off-by: Kefeng Wang <wangkefeng.wang@xxxxxxxxxx>
---
v2:
- update changelog to clarify the impact, per Andrew
fs/hugetlbfs/inode.c | 2 +-
mm/memory.c | 1 +
2 files changed, 2 insertions(+), 1 deletion(-)
diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
index a4441fb77f7c..a5ea006f403e 100644
--- a/fs/hugetlbfs/inode.c
+++ b/fs/hugetlbfs/inode.c
@@ -825,7 +825,7 @@ static long hugetlbfs_fallocate(struct file *file, int mode, loff_t offset,
error = PTR_ERR(folio);
goto out;
}
- folio_zero_user(folio, ALIGN_DOWN(addr, hpage_size));
+ folio_zero_user(folio, addr);
__folio_mark_uptodate(folio);
error = hugetlb_add_to_page_cache(folio, mapping, index);
if (unlikely(error)) {
diff --git a/mm/memory.c b/mm/memory.c
index 75c2dfd04f72..ef47b7ea5ddd 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -6821,6 +6821,7 @@ static void clear_gigantic_page(struct folio *folio, unsigned long addr,
int i;
might_sleep();
+ addr = ALIGN_DOWN(addr, folio_size(folio));
Right, that's what's effectively done, in a very bad way, in process_huge_page():
unsigned long addr = addr_hint &
~(((unsigned long)nr_pages << PAGE_SHIFT) - 1);
That should all be cleaned up ... process_huge_page() likely shouldn't
even be consuming "nr_pages".
Yes, let's fix the bug first. Not sure about this part, though; it uses
nr_pages as the end and calculates the 'base' from it.
It should be using folio_nr_pages().
But process_huge_page() doesn't take an explicit folio argument. I'd like
to move the aligned-address calculation into folio_zero_user() and
copy_user_large_folio() (which I will rename to folio_copy_user()) in the
follow-up cleanup patches, or should I do it in the fix patches?
First, why does folio_zero_user() call process_huge_page() for *a small
folio*? Because we like our code to be extra complicated to understand?
Or am I missing something important?
folio_zero_user() was used for PMD-sized THP and HugeTLB before, and
since anon mTHP support it is used for order-2 through PMD-order THP and
HugeTLB, so it won't process a small folio if I understand correctly.
Second, we should be passing the folio to "process_huge_page" and likely
rename it to "folio_process_pages()" or sth like that. The function even
documents "of the specified huge page", but there is none specified. The
copy case might require a rework.
I think this code needs a serious cleanup ...
Yes, I'd like to do more cleanup and rework them. I also found a
performance issue on my arm64 machine[1] which needs to be addressed.
[1]
https://lore.kernel.org/linux-mm/2524689c-08f5-446c-8cb9-924f9db0ee3a@xxxxxxxxxx/