Re: [PATCH v3 1/5] mm: memory_hotplug: remove head variable in do_migrate_range()

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 





On 2024/9/30 17:25, David Hildenbrand wrote:
On 29.09.24 04:19, Miaohe Lin wrote:
On 2024/9/29 10:04, Kefeng Wang wrote:


On 2024/9/29 9:16, Miaohe Lin wrote:
On 2024/9/28 16:39, David Hildenbrand wrote:
On 28.09.24 10:34, David Hildenbrand wrote:
On 28.09.24 06:55, Matthew Wilcox wrote:
On Tue, Aug 27, 2024 at 07:47:24PM +0800, Kefeng Wang wrote:
Directly use a folio for HugeTLB and THP when calculate the next pfn, then
remove unused head variable.

I just noticed this got merged.  You're going to hit BUG_ON with it.

-        if (PageHuge(page)) {
-            pfn = page_to_pfn(head) + compound_nr(head) - 1;
-            isolate_hugetlb(folio, &source);
-            continue;
-        } else if (PageTransHuge(page))
-            pfn = page_to_pfn(head) + thp_nr_pages(page) - 1;
+        /*
+         * No reference or lock is held on the folio, so it might
+         * be modified concurrently (e.g. split).  As such,
+         * folio_nr_pages() may read garbage.  This is fine as the outer
+         * loop will revisit the split folio later.
+         */
+        if (folio_test_large(folio)) {

But it's not fine.  Look at the implementation of folio_test_large():

static inline bool folio_test_large(const struct folio *folio)
{
            return folio_test_head(folio);
}

That's going to be provided by:

#define FOLIO_TEST_FLAG(name, page)                                     \ static __always_inline bool folio_test_##name(const struct folio *folio) \
{ return test_bit(PG_##name, const_folio_flags(folio, page)); }

and here's the BUG:

static const unsigned long *const_folio_flags(const struct folio *folio,
                    unsigned n)
{
            const struct page *page = &folio->page;

            VM_BUG_ON_PGFLAGS(PageTail(page), page);
            VM_BUG_ON_PGFLAGS(n > 0 && !test_bit(PG_head, &page- >flags), page);
            return &page[n].flags;
}

(this page can be transformed from a head page to a tail page because,
as the comment notes, we don't hold a reference.

Please back this out.

Should we generalize the approach in dump_folio() to locally copy a
folio, so we can safely perform checks before deciding whether we want to try grabbing a reference on the real folio (if it's still a folio :) )?


Oh, and I forgot: isn't the existing code already racy?

PageTransHuge() -> VM_BUG_ON_PAGE(PageTail(page), page);

Yes, in v1[1], I asked same question for existing code for PageTransHuge(page),

   "If the page is a tail page, we will BUG_ON(DEBUG_VM enabled) here,
    but it seems that we don't guarantee the page won't be a tail page."


...

do_migrate_range is called after start_isolate_page_range(). So a page might not be able to
transform from a head page to a tail page as it's isolated?
start_isolate_page_range() is only isolate free pages, so maybe irrelevant.

A page transform from a head page to a tail page should through the below steps:
1. The compound page is freed into buddy.
2. It's merged into larger order in buddy.
3. It's allocated as a larger order compound page.

Since it is isolated, I think step 2 or 3 cannot happen. Or am I miss something?

By isolated, you mean that the pageblock is isolated, and all free pages are in the MIGRATE_ISOLATE buddy list. Nice observation.

Indeed, a tail page could become a head page (concurrent split is possible), but a head page should not become a tail for the reason you mention.

Even mm/page_reporting.c will skip isolated pageblocks.

I wonder if there are some corner cases, but nothing comes to mind that would perform compound allocations from the MIGRATE_ISOLATE list.


As David/Miaohe explanation, it seems that we can't hint the VM_BUG_ON,
so no more changes, thanks for all the review and comments.







[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]

  Powered by Linux