Re: [PATCH] mm: fix possible OOB in numa_rebuild_large_mapping()

Kefeng Wang <wangkefeng.wang@xxxxxxxxxx> · Wed, 12 Jun 2024 10:41:27 +0800

Hi David and Baolin,

On 2024/6/7 18:37, David Hildenbrand wrote:
On 07.06.24 12:32, Kefeng Wang wrote:
A large folio is mapped at a folio-size-aligned virtual address during
the page fault, e.g. 'addr = ALIGN_DOWN(vmf->address, nr_pages * PAGE_SIZE)'
in do_anonymous_page(). After mremap(), however, the virtual address is
only required to be PAGE_SIZE aligned, and the ptes are moved to the new
location in move_page_tables(), so traversing the new ptes in
numa_rebuild_large_mapping() hits the following issue,

    Unable to handle kernel paging request at virtual address 00000a80c021a788
    Mem abort info:
      ESR = 0x0000000096000004
      EC = 0x25: DABT (current EL), IL = 32 bits
      SET = 0, FnV = 0
      EA = 0, S1PTW = 0
      FSC = 0x04: level 0 translation fault
    Data abort info:
      ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000
      CM = 0, WnR = 0, TnD = 0, TagAccess = 0
      GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
    user pgtable: 4k pages, 48-bit VAs, pgdp=00002040341a6000
    [00000a80c021a788] pgd=0000000000000000, p4d=0000000000000000
    Internal error: Oops: 0000000096000004 [#1] SMP
    ...
    CPU: 76 PID: 15187 Comm: git Kdump: loaded Tainted: G        W          6.10.0-rc2+ #209
    Hardware name: Huawei TaiShan 2280 V2/BC82AMDD, BIOS 1.79 08/21/2021
    pstate: 60400009 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
    pc : numa_rebuild_large_mapping+0x338/0x638
    lr : numa_rebuild_large_mapping+0x320/0x638
    sp : ffff8000b41c3b00
    x29: ffff8000b41c3b30 x28: ffff8000812a0000 x27: 00000000000a8000
    x26: 00000000000000a8 x25: 0010000000000001 x24: ffff20401c7170f0
    x23: 0000ffff33a1e000 x22: 0000ffff33a76000 x21: ffff20400869eca0
    x20: 0000ffff33976000 x19: 00000000000000a8 x18: ffffffffffffffff
    x17: 0000000000000000 x16: 0000000000000020 x15: ffff8000b41c36a8
    x14: 0000000000000000 x13: 205d373831353154 x12: 5b5d333331363732
    x11: 000000000011ff78 x10: 000000000011ff10 x9 : ffff800080273f30
    x8 : 000000320400869e x7 : c0000000ffffd87f x6 : 00000000001e6ba8
    x5 : ffff206f3fb5af88 x4 : 0000000000000000 x3 : 0000000000000000
    x2 : 0000000000000000 x1 : fffffdffc0000000 x0 : 00000a80c021a780
    Call trace:
     numa_rebuild_large_mapping+0x338/0x638
     do_numa_page+0x3e4/0x4e0
     handle_pte_fault+0x1bc/0x238
     __handle_mm_fault+0x20c/0x400
     handle_mm_fault+0xa8/0x288
     do_page_fault+0x124/0x498
     do_translation_fault+0x54/0x80
     do_mem_abort+0x4c/0xa8
     el0_da+0x40/0x110
     el0t_64_sync_handler+0xe4/0x158
     el0t_64_sync+0x188/0x190

Do you have an easy reproducer that we can use to reproduce+verify this 
issue? The description above indicates to me that this should not be too 
complicated to write :)

Sorry for the late reply, it was a traditional Chinese festival. The issue
is easily reproduced on arm64 when mTHP is enabled and the "align larger
anonymous mappings on THP boundaries" change is dropped. Here are the steps,

  drop the "align larger anonymous mappings on THP boundaries" change
    [optional, but without it the issue is not very easy to reproduce]
  cd /sys/kernel/mm/transparent_hugepage/
  echo never  > hugepages-2048kB/enabled
  echo always > hugepages-1024kB/enabled  (other sizes can reproduce the issue too)
  git clone root@127.0.0.1:/home/git/linux  (clone the local kernel repo)

and in numa_rebuild_large_mapping() we hit several different errors, but
most of them are bad accesses at

  if (pfn_folio(pte_pfn(ptent)) != folio)

The page is invalid, so I guess ptent/start_ptep is wrong during the
traversal.
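
For illustration only, here is a minimal userspace sketch of the alignment
arithmetic from the patch description. It is not kernel code. The 4k page
size, the 1024kB folio size, the helper names and the addresses used with it
are assumptions made up for the example. It only shows that a folio-sized
range starting at an address that is merely PAGE_SIZE aligned can straddle
a 2MB boundary (the range covered by one PTE table with 4k pages), while a
range starting at a folio-size-aligned address cannot.

```c
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SIZE   UINT64_C(4096)          /* assumed: 4k pages */
#define PMD_SIZE    (512 * PAGE_SIZE)       /* 2MB covered by one PTE table */
#define FOLIO_PAGES UINT64_C(256)           /* assumed: 1024kB folio */
#define FOLIO_SIZE  (FOLIO_PAGES * PAGE_SIZE)

/* Round addr down to a multiple of align (align must be a power of two). */
static uint64_t align_down(uint64_t addr, uint64_t align)
{
    return addr & ~(align - 1);
}

/* True if start is a multiple of the folio size, as it is at fault time. */
static bool folio_aligned(uint64_t start)
{
    return align_down(start, FOLIO_SIZE) == start;
}

/* True if [start, start + len) spans more than one PMD_SIZE-sized block. */
static bool crosses_pmd(uint64_t start, uint64_t len)
{
    return align_down(start, PMD_SIZE) !=
           align_down(start + len - 1, PMD_SIZE);
}
```

With a hypothetical fault address of 0x7f0000123000, align_down() gives a
start of 0x7f0000100000, and crosses_pmd() is false for that range. For a
hypothetical post-mremap() start of 0x7f00401a8000, folio_aligned() is false
and crosses_pmd() is true, so a walk over all 256 ptes from that start would
run past the end of a single PTE table.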