On 2024/9/28 16:39, David Hildenbrand wrote: > On 28.09.24 10:34, David Hildenbrand wrote: >> On 28.09.24 06:55, Matthew Wilcox wrote: >>> On Tue, Aug 27, 2024 at 07:47:24PM +0800, Kefeng Wang wrote: >>>> Directly use a folio for HugeTLB and THP when calculate the next pfn, then >>>> remove unused head variable. >>> >>> I just noticed this got merged. You're going to hit BUG_ON with it. >>> >>>> - if (PageHuge(page)) { >>>> - pfn = page_to_pfn(head) + compound_nr(head) - 1; >>>> - isolate_hugetlb(folio, &source); >>>> - continue; >>>> - } else if (PageTransHuge(page)) >>>> - pfn = page_to_pfn(head) + thp_nr_pages(page) - 1; >>>> + /* >>>> + * No reference or lock is held on the folio, so it might >>>> + * be modified concurrently (e.g. split). As such, >>>> + * folio_nr_pages() may read garbage. This is fine as the outer >>>> + * loop will revisit the split folio later. >>>> + */ >>>> + if (folio_test_large(folio)) { >>> >>> But it's not fine. Look at the implementation of folio_test_large(): >>> >>> static inline bool folio_test_large(const struct folio *folio) >>> { >>> return folio_test_head(folio); >>> } >>> >>> That's going to be provided by: >>> >>> #define FOLIO_TEST_FLAG(name, page) \ >>> static __always_inline bool folio_test_##name(const struct folio *folio) \ >>> { return test_bit(PG_##name, const_folio_flags(folio, page)); } >>> >>> and here's the BUG: >>> >>> static const unsigned long *const_folio_flags(const struct folio *folio, >>> unsigned n) >>> { >>> const struct page *page = &folio->page; >>> >>> VM_BUG_ON_PGFLAGS(PageTail(page), page); >>> VM_BUG_ON_PGFLAGS(n > 0 && !test_bit(PG_head, &page->flags), page); >>> return &page[n].flags; >>> } >>> >>> (this page can be transformed from a head page to a tail page because, >>> as the comment notes, we don't hold a reference. >>> >>> Please back this out. >> >> Should we generalize the approach in dump_folio() to locally copy a >> folio, so we can safely perform checks before deciding whether we want >> to try grabbing a reference on the real folio (if it's still a folio :) )? >> > > Oh, and I forgot: isn't the existing code already racy? > > PageTransHuge() -> VM_BUG_ON_PAGE(PageTail(page), page); do_migrate_range is called after start_isolate_page_range(). So a page might not be able to transform from a head page to a tail page as it's isolated? Thanks.