On Fri, 2019-03-15 at 16:58 -0400, Daniel Jordan wrote:
> On Tue, Mar 12, 2019 at 10:55:27PM +0500, Mikhail Gavrilov wrote:
> > Hi folks.
> > I am observed kernel panic after updated to git commit 610cd4eadec4.
> > I am did not make git bisect because this crashes occurs spontaneously
> > and I not have exactly instruction how reproduce it.
> >
> > Hope backtrace below could help understand how fix it:
> >
> > page:ffffef46607ce000 is uninitialized and poisoned
> > raw: ffffffffffffffff ffffffffffffffff ffffffffffffffff ffffffffffffffff
> > raw: ffffffffffffffff ffffffffffffffff ffffffffffffffff ffffffffffffffff
> > page dumped because: VM_BUG_ON_PAGE(PagePoisoned(p))
> > ------------[ cut here ]------------
> > kernel BUG at include/linux/mm.h:1020!
> > invalid opcode: 0000 [#1] SMP NOPTI
> > CPU: 1 PID: 118 Comm: kswapd0 Tainted: G C
> > 5.1.0-0.rc0.git4.1.fc31.x86_64 #1
> > Hardware name: System manufacturer System Product Name/ROG STRIX
> > X470-I GAMING, BIOS 1201 12/07/2018
> > RIP: 0010:__reset_isolation_pfn+0x244/0x2b0
>
> This is new code, from e332f741a8dd1 ("mm, compaction: be selective about what
> pageblocks to clear skip hints"), so I added some folks.
>
> Can you show
> $LINUX/scripts/faddr2line path/to/vmlinux __reset_isolation_pfn+0x244
> ?

Yes, this looks like another instance of page flag corruption. I have been
chasing it for a while.

https://lore.kernel.org/linux-mm/604a92ae-cbbb-7c34-f9aa-f7c08925bedf@xxxxxx/

Basically, it is easier to reproduce on linux-next than on mainline. The LTP
oom* tests and stress-ng have been useful for reproducing it so far.

# stress-ng --sequential 64 --class vm --aggressive -t 60 --times

I did manage to reproduce the memory corruption on arm64 on mainline too
(originally only on x86_64). It is still that BUG_ON(!PageBuddy(page)).

[51720.012258] kernel BUG at mm/page_alloc.c:3124!
[51720.040287] CPU: 194 PID: 1311 Comm: kcompactd1 Kdump: loaded Tainted: G W L 5.0.0+ #13
[51720.049411] Hardware name: HPE Apollo 70 /C01_APACHE_MB , BIOS L50_5.13_1.0.6 07/10/2018
[51720.059232] pstate: 90400089 (NzcV daIf +PAN -UAO)
[51720.064038] pc : __isolate_free_page+0x7bc/0x804
[51720.068659] lr : compaction_alloc+0x948/0x2490
[51720.073094] sp : edff8009836576c0
[51720.076400] x29: edff800983657740 x28: efff100000000000
[51720.081705] x27: ffff80977c3b8f10 x26: 0000000000000009
[51720.087010] x25: ffff80977c3b90b8 x24: ffff80977c3b8f20
[51720.092314] x23: 0000000000000800 x22: ffff80977c3b8f40
[51720.097619] x21: 00000000000000ff x20: 00000000000000ff
[51720.102923] x19: ffff80977c3b8f10 x18: efff100000000000
[51720.108227] x17: ffff1000115c02b8 x16: 0000000000918000
[51720.113532] x15: 0000000000912000 x14: efff100000000000
[51720.118838] x13: 00000000000000ff x12: 00000000000000ff
[51720.124141] x11: 00000000000000ff x10: 00000000000000ff
[51720.129447] x9 : 00000000f0000000 x8 : 0000000070000000
[51720.134753] x7 : 0000000000000000 x6 : ffff1000105f5620
[51720.140058] x5 : 0000000000000000 x4 : 0000000000000080
[51720.145364] x3 : ffff80977c3b90c0 x2 : 0000000000000000
[51720.150669] x1 : 0000000000000009 x0 : ffff1000132fe200
[51720.155976] Process kcompactd1 (pid: 1311, stack limit = 0x00000000c41b1162)
[51720.163015] Call trace:
[51720.165457] __isolate_free_page+0x7bc/0x804
[51720.169721] compaction_alloc+0x948/0x2490
[51720.173821] unmap_and_move+0xdc/0x1dbc
[51720.177649] migrate_pages+0x274/0x1310
[51720.181476] compact_zone+0x26f8/0x43c8
[51720.185304] kcompactd+0x15b8/0x1a24
[51720.188874] kthread+0x374/0x390
[51720.192100] ret_from_fork+0x10/0x18
[51720.195669] Code: 94176b90 17fffebb d0016e20 91080000 (d4210000)
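
For anyone who wants to resolve the frames in a trace like this one with
faddr2line, as requested earlier in the thread, here is a minimal sketch.
The helper name oops_frames and the file name oops.txt are hypothetical, and
the sketch assumes GNU grep, sed and awk are available; it is an illustration
only, not something that was run as part of this report.

```shell
# Hypothetical helper: read an oops on stdin and print each unique
# symbol+offset frame, in order of first appearance. The trailing
# "/0x<size>" part of each frame is dropped.
oops_frames() {
    grep -oE '[A-Za-z_][A-Za-z0-9_.]*\+0x[0-9a-f]+/0x[0-9a-f]+' |
        sed -E 's|/0x[0-9a-f]+$||' |
        awk '!seen[$0]++'
}

# Example use (assumes the oops was saved to oops.txt and that $LINUX is a
# kernel tree whose vmlinux was built with debug info):
# oops_frames < oops.txt | xargs "$LINUX"/scripts/faddr2line path/to/vmlinux
```

The pc and lr lines repeat the first two call trace entries, which is why the
helper removes duplicates before the frames are passed on.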