On Wed, Dec 19, 2012 at 06:37:45PM -0800, H. Peter Anvin wrote:
> On 12/19/2012 04:29 PM, Jacob Shin wrote:
> > On Wed, Dec 19, 2012 at 04:24:09PM -0800, H. Peter Anvin wrote:
> >> On 12/19/2012 04:07 PM, Jacob Shin wrote:
> >>>
> >>> From what I remember, accessing memory around the memory hole (not
> >>> just the HT hole, but e038000000 ~ 10000000000 on our mentioned system)
> >>> generated prefetches because the memory hole was marked as WB in PAT.
> >>>
> >>> I'll take a look at the system again, try the blanket MTRR covering
> >>> 0xe000000000 ~ 1TB, and talk to our BIOS guys.
> >>>
> >>
> >> Yes, but do they all #MC (as opposed to, say, fetching all FFs)?
> >
> > Yes, MCE every time and it was fatal.
> >
>
> OK, one more question... there is something odd with the memory ranges here:
>
> BIOS-e820: [mem 0x0000000100000000-0x000000e037ffffff] usable
> BIOS-e820: [mem 0x000000e038000000-0x000000fcffffffff] reserved
> BIOS-e820: [mem 0x0000010000000000-0x0000011ffeffffff] usable
>
> The first usable range here is 4G to 896G + 896M which is an awfully
> strange number.  Similarly, the second range is 1T to 1T + 128G - 16M.
> The little fiddly bits imply that there is either overshoot of some sort
> going on -- possibly reserved memory -- or these are fairly arbitrary
> sizes that don't match any physical bank sizes in which case it should
> be possible to shuffle it differently...

Not exactly sure why the weird boundaries; I'll have to ask the BIOS side
folks to be sure. But if I were to guess ..

Here is the NUMA spew out. Physically, there is 128 GB connected to each
memory controller node. The PCI MMIO region starts at 0xc8000000, and
4 GB - 0xc8000000 = 0x38000000 (896 MB), so we lose 896 MB below 4 GB to
the PCI MMIO hole. The first node therefore ends at 128 GB + 896 MB so
that it can address all 128 GB behind the first memory controller, hence
the weird 896 MB offset.
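For what it's worth, the guess above is easy to sanity-check. The sketch below is not from any BIOS or kernel source; it just redoes the arithmetic in Python, and the constant and variable names are made up for illustration. It assumes exactly 128 GB behind each of nodes 0 to 6, laid out back to back.

```python
# Sanity check of the MMIO-hole arithmetic. All names are illustrative.
GB = 1 << 30
MB = 1 << 20

MMIO_START = 0xC8000000   # start of the PCI MMIO region (from the SRAT output)
FOUR_GB = 4 * GB
PER_NODE = 128 * GB       # assumed: memory behind each memory controller node

# RAM that cannot live below 4 GB because the MMIO hole sits there.
hole = FOUR_GB - MMIO_START
print(hex(hole), hole // MB)        # 0x38000000 896

# Node 0 must extend past 128 GB by the size of the hole.
node0_end = PER_NODE + hole
print(hex(node0_end))               # 0x2038000000

# RAM actually covered by node 0: [0, MMIO_START) plus [4 GB, node0_end).
node0_ram = MMIO_START + (node0_end - FOUR_GB)
print(node0_ram == PER_NODE)        # True

# Nodes 1..6 each add another 128 GB on top of node 0's end.
node_ends = [node0_end + i * PER_NODE for i in range(7)]
print(hex(node_ends[-1]))           # 0xe038000000
```

The last value matches the start of the reserved e820 range, which is consistent with the 896 MB offset simply being carried through nodes 0 to 6. It does not explain the 16 MB missing at the top of node 7.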
[    0.000000] SRAT: Node 0 PXM 0 0-a0000
[    0.000000] SRAT: Node 0 PXM 0 100000-c8000000
[    0.000000] SRAT: Node 0 PXM 0 100000000-2038000000
[    0.000000] SRAT: Node 1 PXM 1 2038000000-4038000000
[    0.000000] SRAT: Node 2 PXM 2 4038000000-6038000000
[    0.000000] SRAT: Node 3 PXM 3 6038000000-8038000000
[    0.000000] SRAT: Node 4 PXM 4 8038000000-a038000000
[    0.000000] SRAT: Node 5 PXM 5 a038000000-c038000000
[    0.000000] SRAT: Node 6 PXM 6 c038000000-e038000000
[    0.000000] SRAT: Node 7 PXM 7 10000000000-11fff000000
[    0.000000] NUMA: Initialized distance table, cnt=8
[    0.000000] NUMA: Node 0 [0,a0000) + [100000,c8000000) -> [0,c8000000)
[    0.000000] NUMA: Node 0 [0,c8000000) + [100000000,2038000000) -> [0,2038000000)
[    0.000000] Initmem setup node 0 0000000000000000-0000002038000000
[    0.000000]   NODE_DATA [0000002037ff5000 - 0000002037ffffff]
[    0.000000] Initmem setup node 1 0000002038000000-0000004038000000
[    0.000000]   NODE_DATA [0000004037ff5000 - 0000004037ffffff]
[    0.000000] Initmem setup node 2 0000004038000000-0000006038000000
[    0.000000]   NODE_DATA [0000006037ff5000 - 0000006037ffffff]
[    0.000000] Initmem setup node 3 0000006038000000-0000008038000000
[    0.000000]   NODE_DATA [0000008037ff5000 - 0000008037ffffff]
[    0.000000] Initmem setup node 4 0000008038000000-000000a038000000
[    0.000000]   NODE_DATA [000000a037ff5000 - 000000a037ffffff]
[    0.000000] Initmem setup node 5 000000a038000000-000000c038000000
[    0.000000]   NODE_DATA [000000c037ff5000 - 000000c037ffffff]
[    0.000000] Initmem setup node 6 000000c038000000-000000e038000000
[    0.000000]   NODE_DATA [000000e037ff2000 - 000000e037ffcfff]
[    0.000000] Initmem setup node 7 0000010000000000-0000011fff000000
[    0.000000]   NODE_DATA [0000011ffeff1000 - 0000011ffeffbfff]
[    0.000000] Zone PFN ranges:
[    0.000000]   DMA      0x00000010 -> 0x00001000
[    0.000000]   DMA32    0x00001000 -> 0x00100000
[    0.000000]   Normal   0x00100000 -> 0x11fff000
[    0.000000] Movable zone start PFN for each node
[    0.000000] early_node_map[10] active PFN ranges
[    0.000000]     0: 0x00000010 -> 0x00000099
[    0.000000]     0: 0x00000100 -> 0x000c7ec0
[    0.000000]     0: 0x00100000 -> 0x02038000
[    0.000000]     1: 0x02038000 -> 0x04038000
[    0.000000]     2: 0x04038000 -> 0x06038000
[    0.000000]     3: 0x06038000 -> 0x08038000
[    0.000000]     4: 0x08038000 -> 0x0a038000
[    0.000000]     5: 0x0a038000 -> 0x0c038000
[    0.000000]     6: 0x0c038000 -> 0x0e038000
[    0.000000]     7: 0x10000000 -> 0x11fff000
[    0.000000] On node 0 totalpages: 33553993
[    0.000000]   DMA zone: 56 pages used for memmap
[    0.000000]   DMA zone: 5 pages reserved
[    0.000000]   DMA zone: 3916 pages, LIFO batch:0
[    0.000000]   DMA32 zone: 14280 pages used for memmap
[    0.000000]   DMA32 zone: 800504 pages, LIFO batch:31
[    0.000000]   Normal zone: 447552 pages used for memmap
[    0.000000]   Normal zone: 32287680 pages, LIFO batch:31
[    0.000000] On node 1 totalpages: 33554432
[    0.000000]   Normal zone: 458752 pages used for memmap
[    0.000000]   Normal zone: 33095680 pages, LIFO batch:31
[    0.000000] On node 2 totalpages: 33554432
[    0.000000]   Normal zone: 458752 pages used for memmap
[    0.000000]   Normal zone: 33095680 pages, LIFO batch:31
[    0.000000] On node 3 totalpages: 33554432
[    0.000000]   Normal zone: 458752 pages used for memmap
[    0.000000]   Normal zone: 33095680 pages, LIFO batch:31
[    0.000000] On node 4 totalpages: 33554432
[    0.000000]   Normal zone: 458752 pages used for memmap
[    0.000000]   Normal zone: 33095680 pages, LIFO batch:31
[    0.000000] On node 5 totalpages: 33554432
[    0.000000]   Normal zone: 458752 pages used for memmap
[    0.000000]   Normal zone: 33095680 pages, LIFO batch:31
[    0.000000] On node 6 totalpages: 33554432
[    0.000000]   Normal zone: 458752 pages used for memmap
[    0.000000]   Normal zone: 33095680 pages, LIFO batch:31
[    0.000000] On node 7 totalpages: 33550336
[    0.000000]   Normal zone: 458696 pages used for memmap
[    0.000000]   Normal zone: 33091640 pages, LIFO batch:31

>
> -hpa
>

--
To unsubscribe from this list: send the line "unsubscribe linux-tip-commits" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
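
As a cross-check on the log above, the per-node "totalpages" values can be recomputed from the early_node_map PFN ranges. This is only a sketch: the list is hand-copied from the log, the helper name `pages_per_node` is made up, and the size conversion assumes 4 KiB pages.

```python
# (node, start_pfn, end_pfn), hand-copied from the early_node_map lines.
EARLY_NODE_MAP = [
    (0, 0x00000010, 0x00000099),
    (0, 0x00000100, 0x000c7ec0),
    (0, 0x00100000, 0x02038000),
    (1, 0x02038000, 0x04038000),
    (2, 0x04038000, 0x06038000),
    (3, 0x06038000, 0x08038000),
    (4, 0x08038000, 0x0a038000),
    (5, 0x0a038000, 0x0c038000),
    (6, 0x0c038000, 0x0e038000),
    (7, 0x10000000, 0x11fff000),
]

PAGE_SIZE = 4096  # assumed: 4 KiB base pages


def pages_per_node(ranges):
    """Sum the number of page frames in each node's active PFN ranges."""
    totals = {}
    for node, start_pfn, end_pfn in ranges:
        totals[node] = totals.get(node, 0) + (end_pfn - start_pfn)
    return totals


totals = pages_per_node(EARLY_NODE_MAP)
for node in sorted(totals):
    gib = totals[node] * PAGE_SIZE / 2**30
    print(f"node {node}: {totals[node]} pages ({gib:.3f} GiB)")
```

The sums agree with the "On node N totalpages" lines: 33553993 for node 0, 33554432 for nodes 1 to 6, and 33550336 for node 7, which is 4096 pages (16 MB) short of a full 128 GB.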