On Fri, 2010-10-29 at 12:52 -0700, Tim Pepper wrote:
> On Fri 29 Oct at 07:35:35 +0530 btharindu@xxxxxxxxx said:
> > Finally I could isolate the issue further.
> > I tried following kernels and hardware.
> > Issue is visible only with IBM + SLES 11.
> >
> > 1. SLES 11 + IBM HW --> Issue is Visible
> > 2. SLES 11 + HP, Sun HW --> Issue is not Visible
> > 2. 2.6.32 Vanilla + Any HW --> Issue is not Visible
> > 3. 2.6.36 Vanilla + Any HW --> Issue is not Visible
>
> It would be interesting to see the output of "numactl --hardware" for each
> of these scenarios.
>

Also, could you add "mminit_loglevel=2" to the boot command line and
grep the dmesg output for 'zonelist general'?  The general zonelists for
the Normal zones will show the order of allocation for the two nodes.
On a 2 node [AMD] platform, I see:

xxx(lts)dmesg | grep 'zonelist general'
mminit::zonelist general 0:DMA = 0:DMA
mminit::zonelist general 0:DMA32 = 0:DMA32 0:DMA
mminit::zonelist general 0:Normal = 0:Normal 0:DMA32 0:DMA 1:Normal
mminit::zonelist general 1:Normal = 1:Normal 0:Normal 0:DMA32 0:DMA

So the node 0 Normal zone allocates from 0:Normal first, as expected,
and then falls back via DMA32 and DMA [both on node 0], and eventually
to node 1 Normal.  Node 1 starts locally and falls back to node 0
Normal and, finally, the DMA zones.

You can also try:

cat /proc/zoneinfo | egrep '^Node|^  pages|^ +present'

and maybe "watch" that [watch(1)] while you run your tests.

And, just to be sure, you could suspend your dd job [^Z] and take a look
at its mempolicy and such via /proc/<pid>/status [Mems_allowed*] and its
/proc/<pid>/numa_maps.  If you haven't changed anything, you should see
both nodes in Mems_allowed[_list], and all of the policies in the
numa_maps should show 'default'.

Andi already mentioned zone_reclaim_mode.  You'll want that set to '0'
if you want allocations to overflow/fall back to off-node memory without
attempting direct reclaim first.
E.g., set vm.zone_reclaim_mode = 0 in your /etc/sysctl.conf and reload
via 'sysctl -p' if you want it to stick.

Regards,
Lee
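
For the Mems_allowed and numa_maps check described above, something along these lines might save some eyeballing. It is only a rough sketch: the function names mems_allowed_list and numa_policies are made up for illustration, and it assumes the usual layouts, i.e. a "Mems_allowed_list:" line in /proc/<pid>/status and the policy as the second field of each /proc/<pid>/numa_maps line.

```shell
#!/bin/sh
# Hypothetical helpers; the names are illustrative and not part of any tool.

# Print the Mems_allowed_list value from a /proc/<pid>/status-style file.
# Assumes a line of the form "Mems_allowed_list:<tab>0-1".
mems_allowed_list() {
    awk '$1 == "Mems_allowed_list:" { print $2 }' "$1"
}

# Count how often each policy occurs in a /proc/<pid>/numa_maps-style file.
# Assumes the policy is the second whitespace-separated field of every line.
numa_policies() {
    awk '{ count[$2]++ } END { for (p in count) print p, count[p] }' "$1" | sort
}
```

With the dd job suspended, you would call these as "mems_allowed_list /proc/<pid>/status" and "numa_policies /proc/<pid>/numa_maps". On an untouched 2 node box the first should list both nodes and the second should print a single "default" line with the number of mappings.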