On 2024/10/28 21:46, David Hildenbrand wrote:
On 28.10.24 14:33, Kefeng Wang wrote:
On 2024/10/28 21:14, David Hildenbrand wrote:
On 28.10.24 13:52, Kefeng Wang wrote:
On 2024/10/28 18:00, David Hildenbrand wrote:
On 26.10.24 07:43, Kefeng Wang wrote:
When clearing a gigantic page, it zeroes the page from the first page to the
last page. If the addr_hint passed in is not the address of the first page
of the folio, some archs could flush the wrong cache if they do use
addr_hint as a hint. For a non-gigantic page, the base address is
calculated internally, so even with a wrong addr_hint there is only a
performance impact (process_huge_page() wants to process the target page
last to keep its cache lines hot), no functional impact. Let's pass the
real accessed address to folio_zero_user() and use the aligned address in
clear_gigantic_page() to fix it.
Fixes: 78fefd04c123 ("mm: memory: convert clear_huge_page() to folio_zero_user()")
Signed-off-by: Kefeng Wang <wangkefeng.wang@xxxxxxxxxx>
---
v2:
- update changelog to clarify the impact, per Andrew
fs/hugetlbfs/inode.c | 2 +-
mm/memory.c | 1 +
2 files changed, 2 insertions(+), 1 deletion(-)
diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
index a4441fb77f7c..a5ea006f403e 100644
--- a/fs/hugetlbfs/inode.c
+++ b/fs/hugetlbfs/inode.c
@@ -825,7 +825,7 @@ static long hugetlbfs_fallocate(struct file *file, int mode, loff_t offset,
error = PTR_ERR(folio);
goto out;
}
- folio_zero_user(folio, ALIGN_DOWN(addr, hpage_size));
+ folio_zero_user(folio, addr);
__folio_mark_uptodate(folio);
error = hugetlb_add_to_page_cache(folio, mapping, index);
if (unlikely(error)) {
diff --git a/mm/memory.c b/mm/memory.c
index 75c2dfd04f72..ef47b7ea5ddd 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -6821,6 +6821,7 @@ static void clear_gigantic_page(struct folio *folio, unsigned long addr,
int i;
might_sleep();
+ addr = ALIGN_DOWN(addr, folio_size(folio));
Right, that's what's effectively done, in a very bad way, in process_huge_page():
unsigned long addr = addr_hint &
~(((unsigned long)nr_pages << PAGE_SHIFT) - 1);
That should all be cleaned up ... process_huge_page() likely shouldn't
even be consuming "nr_pages".
Yes, let's fix the bug first. Not sure about this part, though; it uses
nr_pages as the end and calculates the 'base' from it.
It should be using folio_nr_pages().
But process_huge_page() doesn't take an explicit folio argument. I'd like
to move the aligned-address calculation into folio_zero_user() and
copy_user_large_folio() (which I will rename to folio_copy_user()) in the
follow-up cleanup patches, or should I do it in the fix patches?
First, why does folio_zero_user() call process_huge_page() for *a small
folio*? Because we like our code to be extra complicated to understand?
Or am I missing something important?
folio_zero_user() was used for PMD-sized THP and HugeTLB before, and
since anon mTHP support it is used for order-2 through PMD-order THP and
HugeTLB, so it won't process a small folio if I understand correctly.
Second, we should be passing the folio to "process_huge_page" and likely
rename it to "folio_process_pages()" or sth like that. The function even
documents "of the specified huge page", but there is none specified. The
copy case might require a rework.
I think this code needs a serious cleanup ...
Yes, I'd like to do more cleanup and rework them. I also found a
performance issue on my arm64 machine[1] which needs to be addressed.
[1]
https://lore.kernel.org/linux-mm/2524689c-08f5-446c-8cb9-924f9db0ee3a@xxxxxxxxxx/