Re: [PATCH] mm: hugetlb: independent PMD page table shared count

On 14.12.24 11:44, Liu Shixin wrote:
The folio refcount may be increased unexpectedly through try_get_folio() by
callers such as split_huge_pages. In huge_pmd_unshare(), we use the refcount
to check whether a pmd page table is shared. The check is incorrect if the
refcount is increased by such a caller, and this can cause the page table
to be leaked:

Are you sure it is "leaked"?

I assume what happens is that we end up freeing a page table without calling its destructor. That's why the page freeing code complains about "nonzero mapcount" (the mapcount field is overlaid by something else).

  BUG: Bad page state in process sh  pfn:109324
  page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x66 pfn:0x109324
  flags: 0x17ffff800000000(node=0|zone=2|lastcpupid=0xfffff)
  page_type: f2(table)
  raw: 017ffff800000000 0000000000000000 0000000000000000 0000000000000000
  raw: 0000000000000066 0000000000000000 00000000f2000000 0000000000000000
  page dumped because: nonzero mapcount
  ...
  CPU: 31 UID: 0 PID: 7515 Comm: sh Kdump: loaded Tainted: G    B              6.13.0-rc2master+ #7
  Tainted: [B]=BAD_PAGE
  Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015
  Call trace:
   show_stack+0x20/0x38 (C)
   dump_stack_lvl+0x80/0xf8
   dump_stack+0x18/0x28
   bad_page+0x8c/0x130
   free_page_is_bad_report+0xa4/0xb0
   free_unref_page+0x3cc/0x620
   __folio_put+0xf4/0x158
   split_huge_pages_all+0x1e0/0x3e8
   split_huge_pages_write+0x25c/0x2d8
   full_proxy_write+0x64/0xd8
   vfs_write+0xcc/0x280
   ksys_write+0x70/0x110
   __arm64_sys_write+0x24/0x38
   invoke_syscall+0x50/0x120
   el0_svc_common.constprop.0+0xc8/0xf0
   do_el0_svc+0x24/0x38
   el0_svc+0x34/0x128
   el0t_64_sync_handler+0xc8/0xd0
   el0t_64_sync+0x190/0x198
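
One way to read the dump above, as a rough user-space sketch and not the kernel's code: assume the 32-bit word that the free path inspects as a mapcount is the same word that carries the page type, and that -1 is its "clean" value. The union, the MODEL_* constants and the model_* helpers are hypothetical names made up for illustration; only the 0xf2 type tag is taken from the dump.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model: one 32-bit word, read either as a mapcount
 * or as a page type. This is not the kernel's struct page. */
union type_or_mapcount {
	int32_t  mapcount;   /* -1 means "no mappings" in this model */
	uint32_t page_type;  /* type tag kept in the top byte, as in the dump */
};

#define MODEL_TYPE_TABLE 0xf2000000u /* matches "page_type: f2(table)" */
#define MODEL_TYPE_NONE  0xffffffffu /* reads back as mapcount == -1 */

/* Setup step: mark the page as a page table. */
static void model_table_ctor(union type_or_mapcount *w)
{
	w->page_type = MODEL_TYPE_TABLE;
}

/* Teardown step: clear the page table marking again. */
static void model_table_dtor(union type_or_mapcount *w)
{
	w->page_type = MODEL_TYPE_NONE;
}

/* Free-time sanity check: anything but -1 counts as "nonzero mapcount". */
static bool model_free_check_ok(const union type_or_mapcount *w)
{
	return w->mapcount == -1;
}
```

In this model, freeing right after model_table_ctor() fails the check, while freeing after model_table_dtor() passes. That would match a page that reaches the free path with its table type still set.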

The issue may be triggered by damon, offline_page, page_idle etc., which
can increase the refcount of a page table.

Right, many of them do have a racy folio_test_lru() check in there that prevents "most of the harm", but not all of them do.
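
To illustrate why inferring "shared" from the refcount is fragile, here is a minimal user-space sketch. It assumes a page table that carries both a general refcount and a separate counter of sharers, as the patch subject suggests. struct pt_model and all helper names are hypothetical; none of this is the kernel's actual code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a PMD page table page. */
struct pt_model {
	int refcount;    /* general references; anyone may take one briefly */
	int share_count; /* additional users sharing the table (assumed field) */
};

/* Refcount-based check: more than one reference is taken to mean "shared". */
static bool shared_by_refcount(const struct pt_model *pt)
{
	assert(pt->refcount > 0); /* a live table always has a reference */
	return pt->refcount > 1;
}

/* Dedicated-counter check: unaffected by unrelated references. */
static bool shared_by_share_count(const struct pt_model *pt)
{
	return pt->share_count > 0;
}

/* Models an unrelated path taking and dropping a speculative reference. */
static void transient_get(struct pt_model *pt)
{
	pt->refcount++;
}

static void transient_put(struct pt_model *pt)
{
	assert(pt->refcount > 0);
	pt->refcount--;
}
```

With a table that starts at refcount 1 and share_count 0, a single transient_get() makes shared_by_refcount() report true although nobody shares the table, while shared_by_share_count() still reports false. Once transient_put() runs, both checks agree again, which is what makes the misjudgement a race.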


--
Cheers,

David / dhildenb