Re: kernel BUG at include/linux/mm.h:1020!

On Fri, 2019-03-15 at 16:58 -0400, Daniel Jordan wrote:
> On Tue, Mar 12, 2019 at 10:55:27PM +0500, Mikhail Gavrilov wrote:
> > Hi folks.
> > I am observed kernel panic after updated to git commit 610cd4eadec4.
> > I am did not make git bisect because this crashes occurs spontaneously
> > and I not have exactly instruction how reproduce it.
> > 
> > Hope backtrace below could help understand how fix it:
> > 
> > page:ffffef46607ce000 is uninitialized and poisoned
> > raw: ffffffffffffffff ffffffffffffffff ffffffffffffffff ffffffffffffffff
> > raw: ffffffffffffffff ffffffffffffffff ffffffffffffffff ffffffffffffffff
> > page dumped because: VM_BUG_ON_PAGE(PagePoisoned(p))
> > ------------[ cut here ]------------
> > kernel BUG at include/linux/mm.h:1020!
> > invalid opcode: 0000 [#1] SMP NOPTI
> > CPU: 1 PID: 118 Comm: kswapd0 Tainted: G         C
> > 5.1.0-0.rc0.git4.1.fc31.x86_64 #1
> > Hardware name: System manufacturer System Product Name/ROG STRIX
> > X470-I GAMING, BIOS 1201 12/07/2018
> > RIP: 0010:__reset_isolation_pfn+0x244/0x2b0
> 
> This is new code, from e332f741a8dd1 ("mm, compaction: be selective about what
> pageblocks to clear skip hints"), so I added some folks.
> 
> Can you show
> $LINUX/scripts/faddr2line path/to/vmlinux __reset_isolation_pfn+0x244
> ?

Yes, looks like another instance of page flag corruption. I have been chasing
this thing for a while.

https://lore.kernel.org/linux-mm/604a92ae-cbbb-7c34-f9aa-f7c08925bedf@xxxxxx/

Basically, it is easier to reproduce on linux-next than on mainline.

The LTP oom* tests and stress-ng have been useful for reproducing it so far.

# stress-ng --sequential 64 --class vm --aggressive -t 60 --times

I did manage to reproduce the memory corruption on arm64 with mainline too
(originally only on x86_64). It still hits that BUG_ON(!PageBuddy(page)).

[51720.012258] kernel BUG at mm/page_alloc.c:3124!
[51720.040287] CPU: 194 PID: 1311 Comm: kcompactd1 Kdump: loaded Tainted:
G        W    L    5.0.0+ #13
[51720.049411] Hardware name: HPE Apollo 70             /C01_APACHE_MB         ,
BIOS L50_5.13_1.0.6 07/10/2018
[51720.059232] pstate: 90400089 (NzcV daIf +PAN -UAO)
[51720.064038] pc : __isolate_free_page+0x7bc/0x804
[51720.068659] lr : compaction_alloc+0x948/0x2490
[51720.073094] sp : edff8009836576c0
[51720.076400] x29: edff800983657740 x28: efff100000000000 
[51720.081705] x27: ffff80977c3b8f10 x26: 0000000000000009 
[51720.087010] x25: ffff80977c3b90b8 x24: ffff80977c3b8f20 
[51720.092314] x23: 0000000000000800 x22: ffff80977c3b8f40 
[51720.097619] x21: 00000000000000ff x20: 00000000000000ff 
[51720.102923] x19: ffff80977c3b8f10 x18: efff100000000000 
[51720.108227] x17: ffff1000115c02b8 x16: 0000000000918000 
[51720.113532] x15: 0000000000912000 x14: efff100000000000 
[51720.118838] x13: 00000000000000ff x12: 00000000000000ff 
[51720.124141] x11: 00000000000000ff x10: 00000000000000ff 
[51720.129447] x9 : 00000000f0000000 x8 : 0000000070000000 
[51720.134753] x7 : 0000000000000000 x6 : ffff1000105f5620 
[51720.140058] x5 : 0000000000000000 x4 : 0000000000000080 
[51720.145364] x3 : ffff80977c3b90c0 x2 : 0000000000000000 
[51720.150669] x1 : 0000000000000009 x0 : ffff1000132fe200 
[51720.155976] Process kcompactd1 (pid: 1311, stack limit = 0x00000000c41b1162)
[51720.163015] Call trace:
[51720.165457]  __isolate_free_page+0x7bc/0x804
[51720.169721]  compaction_alloc+0x948/0x2490
[51720.173821]  unmap_and_move+0xdc/0x1dbc
[51720.177649]  migrate_pages+0x274/0x1310
[51720.181476]  compact_zone+0x26f8/0x43c8
[51720.185304]  kcompactd+0x15b8/0x1a24
[51720.188874]  kthread+0x374/0x390
[51720.192100]  ret_from_fork+0x10/0x18
[51720.195669] Code: 94176b90 17fffebb d0016e20 91080000 (d4210000)
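
The symbol+offset/size entries in a trace like the one above are what
scripts/faddr2line takes after the vmlinux path. As a rough sketch only, the
Python below pulls those entries out of a saved log so they can be handed to
faddr2line one by one. The function name extract_frames and the regular
expression are illustrative assumptions, not part of any kernel tooling.

```python
import re

# Matches trace entries such as "__isolate_free_page+0x7bc/0x804".
# Assumption: symbols consist of word characters and dots only.
FRAME_RE = re.compile(r"([A-Za-z_][\w.]*)\+(0x[0-9a-f]+)/(0x[0-9a-f]+)")


def extract_frames(log_text):
    """Return (symbol, offset, size) tuples in order of appearance."""
    return [
        (m.group(1), int(m.group(2), 16), int(m.group(3), 16))
        for m in FRAME_RE.finditer(log_text)
    ]


if __name__ == "__main__":
    sample = (
        "[51720.064038] pc : __isolate_free_page+0x7bc/0x804\n"
        "[51720.068659] lr : compaction_alloc+0x948/0x2490\n"
    )
    for symbol, offset, size in extract_frames(sample):
        # Each printed token is in the form faddr2line accepts as an address.
        print(f"{symbol}+{offset:#x}/{size:#x}")
```

Each printed token can then be passed as the address argument in
$LINUX/scripts/faddr2line path/to/vmlinux <token>.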



