Re: [BUG]: commit a1c34a3bf00a breaks an out-of-tree MIPS platform

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 1/17/16 10:39 PM, Joshua Kinard wrote:
Hi,

I recently discovered that commit a1c34a3bf00a (mm: Don't offset memmap for
flatmem) broke an out-of-tree MIPS platform, the IP30 (SGI Octane, a ~late
1990's graphics workstation).  Booting up, I get an "unhandled kernel unaligned
access" when registering one of the IP30-specific serial UART drivers (which
hangs off of the IOC3 PCI metadevice).

It seems that the specific hunk causing the is this one:
@@ -5452,9 +5455,9 @@ static void __init_refok alloc_node_mem_map(struct
pglist_data *pgdat)
  	 */
  	if (pgdat == NODE_DATA(0)) {
  		mem_map = NODE_DATA(0)->node_mem_map;
-#ifdef CONFIG_HAVE_MEMBLOCK_NODE_MAP
+#if defined(CONFIG_HAVE_MEMBLOCK_NODE_MAP) || defined(CONFIG_FLATMEM)
  		if (page_to_pfn(mem_map) != pgdat->node_start_pfn)
-			mem_map -= (pgdat->node_start_pfn - ARCH_PFN_OFFSET);
+			mem_map -= offset;
  #endif /* CONFIG_HAVE_MEMBLOCK_NODE_MAP */
  	}
  #endif


I copied down the Oops message, which is:
[    2.460398] Unhandled kernel unaligned access[#1]:
[    2.460715] CPU: 1 PID: 14 Comm: kdevtmpfs Not tainted
4.4.0-mipsgit-20160110 #42
[    2.461079] task: a800000060181f00 ti: a800000060190000 task.ti:
a800000060190000
[    2.461437] $ 0   : 0000000000000000 0000000020009fe0 0000000000000000
0000000000000000
[    2.461914] $ 4   : 0000000020223e30 a800000060190000 0000000000000001
0000000000000000
[    2.462386] $ 8   : a80000006019fc40 0000000000000003 0000000000000001
a80000006044db80
[    2.462852] $12   : ffffffff9404dce0 000000001000001e 0000000000000000
ffffffffffffff80
[    2.463320] $16   : a80000006019faa0 ffffffffdc500000 0000000000000000
a800000020062664
[    2.463786] $20   : 28000000205a0400 a800000020062a2c 0000000000000000
a800000060023280
[    2.464251] $24   : a80000006044db40 0000000000000000
[    2.464716] $28   : a800000060190000 a80000006019fa50 0000000000000000
a800000020009fe0
[    2.465183] Hi    : fffffffff832db7f
[    2.465363] Lo    : 000000003f9f7ce5
[    2.465563] epc   : a800000020012ebc do_ade+0x57c/0x8b0
[    2.465823] ra    : a800000020009fe0 ret_from_exception+0x0/0x18
[    2.466107] Status: 9404dce2*KX SX UX KERNEL EXL
[    2.466446] Cause : 00000010 (ExcCode 04)
[    2.466642] BadVA : 28000000205a0400
[    2.466822] PrId  : 00000f24 (R14000)
[    2.467011] Process kdevtmpfs (pid: 14, threadinfo=a800000060190000,
task=a800000060181f00, tls=0000000000000000
[    2.467483] Stack : 0000000000000000 ffffffff00000000 a8000000205a03f8
a8000000205a0400
*  a80000006019fc40 a800000060018580 a800000020382310 0000000000000001
*  0000000000000000 a800000020009fe0 0000000000000000 ffffffff9404dce0
*  28000000205a0400 0000000000010000 a8000000205a03f8 0000000000000003
*  0000000000000001 0000000000000000 a80000006019fc40 0000000000000003
*  0000000000000001 a80000006044db80 ffffffffffff0000 000000000000007f
*  0000000000000000 ffffffffffffff80 a8000000205a03f8 a8000000205a0400
*  a80000006019fc40 a800000060018580 a800000020382310 0000000000000001
*  0000000000000000 a800000060023280 a80000006044db40 0000000000000000
*  ffffffffffff0000 000000000000007f a800000060190000 a80000006019fbd0
*  ...
[    2.540485] Call Trace:
[    2.551852] [<a800000020012ebc>] do_ade+0x57c/0x8b0
[    2.563244] [<a800000020009fe0>] ret_from_exception+0x0/0x18
[    2.574590] [<a800000020062660>] __wake_up_common+0x30/0xd0
[    2.586006] [<a800000020062a2c>] __wake_up+0x44/0x68
[    2.597483] [<a800000020063018>] __wake_up_bit+0x38/0x48
[    2.608856] [<a80000002010498c>] evict+0x10c/0x1a8
[    2.620112] [<a8000000200f89d8>] vfs_unlink+0x150/0x188
[    2.631473] [<a800000020203eb0>] handle_remove+0x1f0/0x358
[    2.642846] [<a800000020204468>] devtmpfsd+0x1c8/0x258
[    2.654123] [<a80000002004112c>] kthread+0x10c/0x128
[    2.665231] [<a80000002000a048>] ret_from_kernel_thread+0x14/0x1c
[    2.676326]
[    2.687315]
Code: 00431024  1440ff12  00000000 <6a960000> 6e960007  24020000  1440ff53
00000000  bfb40000
[    2.709740] ---[ end trace d8580deb2e1d1d4a ]---
[    2.721069] Fatal exception: panic in 5 seconds


The key problem is that BadVA specifies an address of 0x28000000205a0400, which
is inside the "unused" address space on 64-bit MIPS platforms.  That address
should really be 0xa8000000205a0400 (which is visible in the stack dump), so
something is getting mistranslated here it seems.

I'm not really sure what ARCH_PFN_OFFSET is used for, but for IP30 systems, it
seems important. Reverting both commit b0aeba741b2d (Fix alloc_node_mem_map()
to work on ia64 again) and this one allows IP30 systems to boot Linux-4.4.0.
However, I don't think that is the right fix, because it's too high up in
generic code, and the other SGI platforms (at least SGI IP32/O2 and SGI
IP27/Origin/Onyx) don't seem to be affected. I'm wondering if there's something
in the MIPS core code that I probably need to make use of, or probably a change
within IP30's platform code.

IP30 support in Linux does have some known issues with the current memory
setup, namely that all memory is currently being assigned to the "DMA" zone,
and nothing for "Normal" or "DMA32".  IP30 also has an oddity where System RAM
physically starts 512MB in on the address space.  I am not sure if either has
any bearing on this specific problem.

Any advice on fixing this properly would be appreciated.

Thanks!


What the patch was going for is to make sure that pfn 0 on the system
corresponds to mem_map[0] when pfn <-> page translation functions are
invoked. I'm guessing something is different on this system so the
code is not generating that properly which is giving the BadVA.

I suspect there is a disconnect between node_start_pfn and ARCH_PFN_OFFSET.
Can you share the platform code and .config you are using and also use
this debugging patch to get a bit more info?

Thanks,
Laura

-----8<-----

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 9d666df..fb353c9 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -5307,6 +5307,10 @@ static void __init_refok alloc_node_mem_map(struct pglist_data *pgdat)
                        mem_map -= offset;
 #endif /* CONFIG_HAVE_MEMBLOCK_NODE_MAP */
        }
+       pr_info(">>> ARCH_PFN_OFFSET %lx\n", ARCH_PFN_OFFSET);
+       pr_info(">>> pgdat->node_start_pfn %lx\n", pgdat->node_start_pfn);
+       pr_info(">>> offset %lx\n", offset);
+       pr_info(">>> first pfn %lx\n", page_to_pfn(mem_map));
 #endif
 #endif /* CONFIG_FLAT_NODE_MEM_MAP */
 }

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@xxxxxxxxx.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@xxxxxxxxx";> email@xxxxxxxxx </a>



[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux]     [Linux OMAP]     [Linux MIPS]     [ECOS]     [Asterisk Internet PBX]     [Linux API]