Re: A crash on ARM64 in move_freepages_block due to uninitialized pages in reserved memory

On Tue, 21 Aug 2018, James Morse wrote:

> Hi guys,
> 
> On 08/21/2018 11:44 AM, Michal Hocko wrote:
> > On Fri 17-08-18 15:44:27, Mikulas Patocka wrote:
> > > I report this crash on ARM64 on the kernel 4.17.11. The reason is that the
> > > function move_freepages_block accesses contiguous runs of
> > > pageblock_nr_pages. The ARM64 firmware sets holes of reserved memory there
> > > and when move_freepages_block stumbles over this hole, it accesses
> > > uninitialized page structures and crashes.
> 
> Any idea if this is nomap (so a hole in the linear map), or a missing struct
> page?

The struct page for this hole seems to be filled with 0xff.

> > > 00000000-03ffffff : System RAM
> > >    00080000-007bffff : Kernel code
> > >    00820000-00aa3fff : Kernel data
> > > 04200000-bf80ffff : System RAM
> > > bf810000-bfbeffff : reserved
> > > bfbf0000-bfc8ffff : System RAM
> > > bfc90000-bffdffff : reserved
> > > bffe0000-bfffffff : System RAM
> > > c0000000-dfffffff : MEM
> > >    c0000000-c00fffff : PCI Bus 0000:01
> > >      c0000000-c0003fff : 0000:01:00.0
> > >        c0000000-c0003fff : nvme
> To test Laura's bounds-of-zone theory [0], could you put some empty space
> between the nvme and the System RAM? (It sounds like this is a KVM guest).
> Reducing the amount of memory is probably easiest.

This is not KVM - it is real hardware with a real PCIe NVMe device. I don't 
have a smaller memory stick.

The board can use u-boot firmware or EFI firmware. The u-boot firmware 
doesn't put a hole in the memory map and the board has been running with 
it for several months without a problem.

The EFI firmware puts a hole below 0xc0000000 and I got a crash after two 
weeks of uptime.

> > > The bug was already reported here for x86:
> > > https://bugzilla.redhat.com/show_bug.cgi?id=1598462
> > > 
> > > For x86, it was fixed in the kernel 4.17.7 - but I observed it in the
> > > kernel 4.17.11 on ARM64. I also observed it on 4.18-rc kernels running in
> > > KVM virtual machine on ARM when I compiled the guest kernel with 64kB page
> > > size.
> 
> I'm not sure this is the same bug.
> 
> [1] reports hitting a VM_BUG, this is a dereference of -ENOENT:

This crash is not from -ENOENT. It crashes because page->compound_head is 
0xffffffffffffffff (see below).

If I enable CONFIG_DEBUG_VM, I also get a VM_BUG.

> > > Unable to handle kernel paging request at virtual address fffffffffffffffe
> 
> Does your kernel have HOLES_IN_ZONE enabled? (It looks like it depends on
> NUMA)

No.

> Could you reproduce this with CONFIG_DEBUG_VM enabled?

I reproduced it in KVM with 64k pages and with CONFIG_DEBUG_VM enabled, see 
below. (The bug can be triggered more quickly in KVM.)

> move_freepages() uses pfn_valid_within(), so it should handle missing struct
> pages in this range.
> 
> 
> > > CPU: 3 PID: 14823 Comm: updatedb.mlocat Not tainted 4.17.11 #16
> > > Hardware name: Marvell Armada 8040 MacchiatoBin/Armada 8040 MacchiatoBin,
> > > BIOS EDK II Jul 30 2018
> > > pstate: 00000085 (nzcv daIf -PAN -UAO)
> > > pc : move_freepages_block+0xb4/0x160
> > > lr : steal_suitable_fallback+0xe4/0x188
> 
> Any chance you could addr2line these?

I analyzed the disassembly:
PageBuddy() in move_freepages() returns false.
Then we call PageLRU(); the macro uses PF_HEAD, which calls compound_head().
compound_head() reads page->compound_head, which is 0xffffffffffffffff, so it 
returns 0xfffffffffffffffe, and accessing this address causes the crash.
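
As a rough illustration of that last step, the sketch below models the head
lookup in Python rather than kernel C. It assumes the usual rule that bit 0 of
page->compound_head marks a tail page and that the head pointer is the field
value minus one. The struct page address passed in is only a placeholder and
does not influence the result.

```python
MASK64 = (1 << 64) - 1  # emulate 64-bit unsigned arithmetic


def compound_head(page_addr: int, compound_head_field: int) -> int:
    """Simplified model of the head-page lookup.

    If bit 0 of page->compound_head is set, the page is treated as a tail
    page and the head pointer is (field - 1); otherwise the page is its
    own head.
    """
    if compound_head_field & 1:
        return (compound_head_field - 1) & MASK64
    return page_addr


# An uninitialized struct page filled with 0xff bytes.
poisoned_field = 0xFFFFFFFFFFFFFFFF
# Placeholder struct page address, used only for illustration.
page = 0xFFFFFDFF802E1780

head = compound_head(page, poisoned_field)
print(hex(head))  # 0xfffffffffffffffe
```

The all-ones pattern has bit 0 set, so the poisoned page looks like a tail
page and the computed "head" is the faulting address from the oops.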

> > > Call trace:
> > >   move_freepages_block+0xb4/0x160
> > >   get_page_from_freelist+0xad8/0xea8
> > >   __alloc_pages_nodemask+0xac/0x970
> > >   new_slab+0xc0/0x348
> > >   ___slab_alloc.constprop.32+0x2cc/0x350
> > >   __slab_alloc.isra.26.constprop.31+0x24/0x38
> > >   kmem_cache_alloc+0x168/0x198
> > >   spadfs_alloc_inode+0x2c/0x88
> > >   alloc_inode+0x20/0xa0
> > >   iget5_locked+0xf8/0x1c0
> 
> > >   spadfs_iget+0x44/0x4c8
> > >   spadfs_lookup+0x70/0x108
> 
> Hmmm. What's this?

http://artax.karlin.mff.cuni.cz/~mikulas/spadfs/download/

> Thanks,
> 
> James
> 
> 
> [0] https://www.spinics.net/lists/linux-mm/msg157223.html
> [1] https://www.spinics.net/lists/linux-mm/msg156764.html

The same crash in KVM. The guest kernel has 64k pages. I enabled 
CONFIG_DEBUG_VM:

[ 1493.526129] page:fffffdff802e1780 is uninitialized and poisoned
[ 1493.526136] raw: ffffffffffffffff ffffffffffffffff ffffffffffffffff ffffffffffffffff
[ 1493.528030] raw: ffffffffffffffff ffffffffffffffff ffffffffffffffff ffffffffffffffff
[ 1493.529320] page dumped because: VM_BUG_ON_PAGE(PagePoisoned(p))
[ 1493.530441] ------------[ cut here ]------------
[ 1493.531301] kernel BUG at include/linux/mm.h:978!
[ 1493.532176] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
[ 1493.533196] Modules linked in: raid0 raid10 dm_delay xfs reiserfs loop dm_crypt dm_zero dm_integrity raid1 dm_raid raid456 async_raid6_recov async_memcpy async_pq raid6_pq async_xor xor async_tx md_mod dm_thin_pool dm_cache_smq dm_cache dm_persistent_data dm_bio_prison libcrc32c dm_mirror dm_region_hash dm_log dm_snapshot dm_bufio dm_mod ipv6 autofs4 binfmt_misc nls_utf8 nls_cp852 vfat fat af_packet aes_ce_blk crypto_simd cryptd aes_ce_cipher crc32_ce crct10dif_ce ghash_ce gf128mul aes_arm64 sha2_ce sha256_arm64 sha1_ce sha1_generic efivars virtio_net virtio_rng net_failover rng_core failover virtio_console ext4 crc32c_generic crc16 mbcache jbd2 virtio_scsi sd_mod scsi_mod virtio_blk virtio_mmio virtio_pci virtio_ring virtio [last unloaded: brd]
[ 1493.545466] CPU: 1 PID: 25236 Comm: dd Not tainted 4.18.0 #7
[ 1493.546540] Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015
[ 1493.547833] pstate: 40000085 (nZcv daIf -PAN -UAO)
[ 1493.548749] pc : move_freepages_block+0x144/0x248
[ 1493.549647] lr : move_freepages_block+0x144/0x248
[ 1493.550539] sp : fffffe0071177680
[ 1493.551176] x29: fffffe0071177680 x28: fffffc000861f3f8
[ 1493.552184] x27: 0000000000000048 x26: fffffc0008492000
[ 1493.553197] x25: fffffe007117771c x24: 000000000007ffc0
[ 1493.554203] x23: fffffc000861ef80 x22: fffffdff802fffc0
[ 1493.555209] x21: 0000000000000020 x20: fffffdff80280000
[ 1493.556220] x19: fffffdff802e1780 x18: 0000000000000000
[ 1493.557227] x17: 000003ff88424b08 x16: fffffc0008182c9c
[ 1493.558232] x15: 000000000000000a x14: 0720072007200720
[ 1493.559239] x13: 0720072007200720 x12: 0720072007200720
[ 1493.560249] x11: 0720072907290770 x10: 072807640765076e
[ 1493.561256] x9 : 076f07730769076f x8 : 0000000000000000
[ 1493.562261] x7 : 0750072807450747 x6 : 0000000000000007
[ 1493.563270] x5 : fffffe00bff30750 x4 : 0000000000000001
[ 1493.564276] x3 : 0000000000000007 x2 : 0000000000000007
[ 1493.565283] x1 : fffffe006260cd00 x0 : 0000000000000034
[ 1493.566297] Process dd (pid: 25236, stack limit = 0x0000000094cc07fb)
[ 1493.567506] Call trace:
[ 1493.567985]  move_freepages_block+0x144/0x248
[ 1493.568812]  steal_suitable_fallback+0x100/0x16c
[ 1493.569694]  get_page_from_freelist+0x440/0xb20
[ 1493.570554]  __alloc_pages_nodemask+0xe8/0x838
[ 1493.571401]  new_slab+0xd4/0x418
[ 1493.572022]  ___slab_alloc.constprop.27+0x380/0x4a8
[ 1493.572952]  __slab_alloc.isra.21.constprop.26+0x24/0x34
[ 1493.573955]  kmem_cache_alloc+0xa8/0x180
[ 1493.574704]  alloc_buffer_head+0x1c/0x90
[ 1493.575452]  alloc_page_buffers+0x68/0xb0
[ 1493.576222]  create_empty_buffers+0x20/0x1ec
[ 1493.577033]  create_page_buffers+0xb0/0xf0
[ 1493.577815]  __block_write_begin_int+0xc4/0x564
[ 1493.578676]  __block_write_begin+0x10/0x18
[ 1493.579457]  block_write_begin+0x48/0xd0
[ 1493.580212]  blkdev_write_begin+0x28/0x30
[ 1493.580977]  generic_perform_write+0x98/0x16c
[ 1493.581807]  __generic_file_write_iter+0x138/0x168
[ 1493.582715]  blkdev_write_iter+0x80/0xf0
[ 1493.583470]  __vfs_write+0xe4/0x10c
[ 1493.584138]  vfs_write+0xb4/0x168
[ 1493.584775]  ksys_write+0x44/0x88
[ 1493.585412]  sys_write+0xc/0x14
[ 1493.586018]  el0_svc_naked+0x30/0x34
[ 1493.586708] Code: aa1303e0 90001a01 91296421 94008902 (d4210000)
[ 1493.587857] ---[ end trace 1601ba47f6e883fe ]---
[ 1493.588780] note: dd[25236] exited with preempt_count 1

Memory map for the KVM guest:

09000000-09000fff : pl011@9000000
  09000000-09000fff : pl011@9000000
09030000-09030fff : pl061@9030000
10000000-3efeffff : pcie@10000000
  10000000-101fffff : PCI Bus 0000:01
    10000000-1003ffff : 0000:01:00.0
    10040000-10040fff : 0000:01:00.0
  10200000-103fffff : PCI Bus 0000:02
  10400000-105fffff : PCI Bus 0000:03
    10400000-10400fff : 0000:03:00.0
  10600000-107fffff : PCI Bus 0000:04
  10800000-109fffff : PCI Bus 0000:05
    10800000-10800fff : 0000:05:00.0
3f000000-3fffffff : PCI ECAM
40000000-f85dffff : System RAM
  40080000-4057ffff : Kernel code
  405d0000-408effff : Kernel data
f85e0000-f86bffff : reserved
f86c0000-f86dffff : System RAM
f86e0000-f874ffff : reserved
f8750000-fbc1ffff : System RAM
fbc20000-fbffffff : reserved
fc000000-ffffffff : System RAM
8000000000-ffffffffff : pcie@10000000
  8000000000-80001fffff : PCI Bus 0000:01
    8000000000-8000003fff : 0000:01:00.0
      8000000000-8000003fff : virtio-pci-modern
  8000200000-80003fffff : PCI Bus 0000:02
  8000400000-80005fffff : PCI Bus 0000:03
    8000400000-8000403fff : 0000:03:00.0
      8000400000-8000403fff : virtio-pci-modern
  8000600000-80007fffff : PCI Bus 0000:04
    8000600000-8000603fff : 0000:04:00.0
      8000600000-8000603fff : virtio-pci-modern
  8000800000-80009fffff : PCI Bus 0000:05
    8000800000-8000803fff : 0000:05:00.0
      8000800000-8000803fff : virtio-pci-modern
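
A hedged cross-check of the register dump against this memory map, written as
a small Python calculation. It rests on several assumptions that the log does
not state: that x20 and x22 hold the first and last struct page of the
pageblock being scanned, that sizeof(struct page) is 64 bytes, and that the
pageblock is the one containing the first reserved region. Only the poisoned
page address (x19, also printed on the "page:" line) and the 64k page size
come directly from the report.

```python
STRUCT_PAGE_SIZE = 64      # assumption: sizeof(struct page) on this kernel
PAGE_SIZE = 64 * 1024      # guest kernel uses 64k pages (from the report)

poisoned_page = 0xFFFFFDFF802E1780  # "page:" address in the BUG output (x19)
start_page = 0xFFFFFDFF80280000     # x20, assumed first struct page of the block
end_page = 0xFFFFFDFF802FFFC0       # x22, assumed last struct page of the block

# Number of pages covered by the block and index of the poisoned page in it.
pages_in_block = (end_page - start_page) // STRUCT_PAGE_SIZE + 1
index = (poisoned_page - start_page) // STRUCT_PAGE_SIZE

# Assume the block is the aligned one that contains the first reserved region.
reserved_start = 0xF85E0000
block_bytes = pages_in_block * PAGE_SIZE
block_phys_start = reserved_start - reserved_start % block_bytes

# Physical address that the poisoned struct page would describe.
phys = block_phys_start + index * PAGE_SIZE

print(pages_in_block)          # 8192
print(hex(block_phys_start))   # 0xe0000000
print(hex(phys))               # 0xf85e0000
```

Under these assumptions the poisoned struct page corresponds to physical
address 0xf85e0000, which is the first byte of the first "reserved" range in
the map above.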

Mikulas



