Re: [PATCH v2 07/13] mm: Remove HUGETLB_PAGE_DTOR

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, Dec 08, 2023 at 06:54:19PM +0100, Vlastimil Babka wrote:
> On 8/16/23 17:11, Matthew Wilcox (Oracle) wrote:
> > We can use a bit in page[1].flags to indicate that this folio belongs
> > to hugetlb instead of using a value in page[1].dtors.  That lets
> > folio_test_hugetlb() become an inline function like it should be.
> > We can also get rid of NULL_COMPOUND_DTOR.
> > 
> > Signed-off-by: Matthew Wilcox (Oracle) <willy@xxxxxxxxxxxxx>
> 
> I think this (as commit 9c5ccf2db04b ("mm: remove HUGETLB_PAGE_DTOR")) is
> causing the bug reported by Luis here:
> https://bugzilla.kernel.org/show_bug.cgi?id=218227

Luis, please stop using bugzilla.  If you'd sent email like a normal
kernel developer, I'd've seen this bug earlier.

> > page:000000009006bf10 refcount:0 mapcount:-128 mapping:0000000000000000 index:0x3f8a0 pfn:0x1035c0
> > flags: 0x17fffc000000000(node=0|zone=2|lastcpupid=0x1ffff)
> > page_type: 0xffffff7f(buddy)
> > raw: 017fffc000000000 ffffe704c422f808 ffffe704c41ac008 0000000000000000
> > raw: 000000000003f8a0 0000000000000005 00000000ffffff7f 0000000000000000
> > page dumped because: VM_BUG_ON_PAGE(n > 0 && !((__builtin_constant_p(PG_head) && __builtin_constant_p((uintptr_t)(&page->flags) != (uintptr_t)((vo>
> > ------------[ cut here ]------------
> > kernel BUG at include/linux/page-flags.h:314!
> > invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
> > CPU: 6 PID: 2435641 Comm: md5sum Not tainted 6.6.0-rc5 #2
> > Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
> > RIP: 0010:folio_flags+0x65/0x70
> > Code: a8 40 74 de 48 8b 47 48 a8 01 74 d6 48 83 e8 01 48 39 c7 75 bd eb cb 48 8b 07 a8 40 75 c8 48 c7 c6 d8 a7 c3 89 e8 3b c7 fa ff <0f> 0b 66 0f >
> > RSP: 0018:ffffad51c0bfb7a8 EFLAGS: 00010246
> > RAX: 000000000000015f RBX: ffffe704c40d7000 RCX: 0000000000000000
> > RDX: 0000000000000000 RSI: ffffffff89be8040 RDI: 00000000ffffffff
> > RBP: 0000000000103600 R08: 0000000000000000 R09: ffffad51c0bfb658
> > R10: 0000000000000003 R11: ffffffff89eacb08 R12: 0000000000000035
> > R13: ffffe704c40d7000 R14: 0000000000000000 R15: ffffad51c0bfb930
> > FS:  00007f350c51b740(0000) GS:ffff9b62fbd80000(0000) knlGS:0000000000000000
> > CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > CR2: 0000555860919508 CR3: 00000001217fe002 CR4: 0000000000770ee0
> > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> > PKRU: 55555554
> > Call Trace:
> >  <TASK>
> >  ? die+0x32/0x80
> >  ? do_trap+0xd6/0x100
> >  ? folio_flags+0x65/0x70
> >  ? do_error_trap+0x6a/0x90
> >  ? folio_flags+0x65/0x70
> >  ? exc_invalid_op+0x4c/0x60
> >  ? folio_flags+0x65/0x70
> >  ? asm_exc_invalid_op+0x16/0x20
> >  ? folio_flags+0x65/0x70
> >  ? folio_flags+0x65/0x70
> >  PageHuge+0x67/0x80
> >  isolate_migratepages_block+0x1c5/0x13b0
> >  compact_zone+0x746/0xfc0
> >  compact_zone_order+0xbb/0x100
> >  try_to_compact_pages+0xf0/0x2f0
> >  __alloc_pages_direct_compact+0x78/0x210
> >  __alloc_pages_slowpath.constprop.0+0xac1/0xdb0
> >  __alloc_pages+0x218/0x240
> >  folio_alloc+0x17/0x50
> 
> It's because PageHuge() now does folio_test_hugetlb() which is documented to
> assume caller holds a reference, but the test in
> isolate_migratepages_block() doesn't. The code is there from 2021 by Oscar,
> perhaps it could be changed to take a reference (and e.g. only test
> PageHead() before), but it will be a bit involved as the
> isolate_or_dissolve_huge_page() it calls has some logic based on the
> refcount being zero/non-zero as well. Oscar, what do you think?
> Also I wonder if any of the the other existing PageHuge() callers are also
> affected because they might be doing so without a reference.

I don't think the warning is actually wrong!  We're living very
dangerously here as PageHuge() could have returned a false positive
before this change [1].  Then we assume that compound_nr() returns a
consistent result (and compound_order() might, for example, return a
value larger than 63, leading to UB).

I think there's a place for a folio_test_hugetlb_unsafe(), but that
would only remove the warning, and do nothing to fix all the unsafe
usage.  The hugetlb handling code in isolate_migratepages_block()
doesn't seem to have any understanding that it's working with pages
that can change under it.  I can have a go at fixing it up, but maybe
Oscar would prefer to do it?

[1] terribly unlikely, but consider what happens if the page starts out
as a non-hugetlb compound allocation.  Then it is freed and reallocated;
the new owner of the second page could have put enough information
into it that tricked PageHuge() into returning true.  Then we call
isolate_or_dissolve_huge_page() which takes a lock, but doesn't take
a refcount.  Now it gets a bogus hstate and things go downhill from
there ...)




[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]

  Powered by Linux