> On Wed, Nov 24, 2010 at 09:27:53AM +0000, Mel Gorman wrote:
> > On Tue, Nov 23, 2010 at 10:43:29PM -0800, Simon Kirby wrote:
> > > On Tue, Nov 23, 2010 at 10:04:03AM +0000, Mel Gorman wrote:
> > > > On Mon, Nov 22, 2010 at 03:44:19PM -0800, Andrew Morton wrote:
> > > > > On Mon, 15 Nov 2010 11:52:46 -0800
> > > > > Simon Kirby <sim@xxxxxxxxxx> wrote:
> > > > >
> > > > > > I noticed that CONFIG_NUMA seems to enable some more complicated
> > > > > > reclaiming bits and figured it might help, since most stock kernels
> > > > > > seem to ship with it now. This seems to have helped, but it may just
> > > > > > be wishful thinking. We still see this happening, though maybe to a
> > > > > > lesser degree. (The following observations are with CONFIG_NUMA
> > > > > > enabled.)
> > > >
> > > > Hi,
> > > >
> > > > As this is a NUMA machine, what is the value of
> > > > /proc/sys/vm/zone_reclaim_mode? When enabled, this reclaims memory
> > > > local to the node in preference to using remote nodes. For certain
> > > > workloads this performs better, but for users who expect all of memory
> > > > to be used, it has surprising results.
> > > >
> > > > If it is set to 1, try testing with it set to 0 and see if that makes a
> > > > difference. Thanks.
> > >
> > > Hi Mel,
> > >
> > > It is set to 0. It's an Intel EM64T. I only enabled CONFIG_NUMA since
> > > it seemed to enable some more complicated handling, and I figured it
> > > might help, but it didn't seem to. It's also required for
> > > CONFIG_COMPACTION, but that is still marked experimental.
> >
> > I'm a little surprised that you are bringing compaction up, because unless
> > high-order allocations are involved, it wouldn't make a difference. Is there
> > a constant source of high-order allocations in the system, e.g. a network
> > card configured to use jumbo frames? A possible consequence of that is
> > reclaim kicking in early to free order-[2-4] pages, which would prevent 100%
> > of memory being used.
>
> We /were/ using jumbo frames, but only over a local cross-over connection
> to another node (for DRBD), so I disabled jumbo frames on this interface
> and reconnected DRBD. Even with MTUs set to 1500, we saw GFP_ATOMIC
> order=3 allocations coming from __alloc_skb:
>
> perf record --event kmem:mm_page_alloc --filter 'order>=3' -a --call-graph sleep 10
> perf trace
>
> imap-20599 [002] 1287672.803567: mm_page_alloc: page=0xffffea00004536c0 pfn=4536000 order=3 migratetype=0 gfp_flags=GFP_ATOMIC|GFP_NOWARN|GFP_NORETRY|GFP_COMP
>
> perf report shows:
>
> __alloc_pages_nodemask
> alloc_pages_current
> new_slab
> __slab_alloc
> __kmalloc_node_track_caller
> __alloc_skb
> __netdev_alloc_skb
> bnx2_poll_work
>
> Dave was seeing these on his laptop with an Intel NIC as well. Ralf
> noted that the slab cache grows in higher-order blocks, so this is
> normal. The GFP_ATOMIC bubbles up from *alloc_skb, I guess.

Please try SLAB instead of SLUB (it can be switched with a kernel build
option). SLUB tries to use high-order allocations implicitly.
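A minimal sketch of how one might confirm which caches are behind those
order-3 slab allocations before rebuilding with CONFIG_SLAB (this assumes a
SLUB-built kernel with sysfs mounted; the kmalloc-4096 cache name is only an
example and the exact caches present depend on the config):

    # page order each SLUB cache uses for its slabs
    # (the /sys/kernel/slab directory exists only with CONFIG_SLUB)
    grep . /sys/kernel/slab/*/order | sort -t: -k2 -rn | head

    # __kmalloc_node_track_caller from __alloc_skb lands in a kmalloc-* cache;
    # an order of 3 here matches the trace above
    cat /sys/kernel/slab/kmalloc-4096/order

    # instead of switching allocators, the slab page order can also be capped
    # at boot by appending this to the kernel command line:
    #   slub_max_order=1

Capping slub_max_order trades some per-object overhead for far fewer order-3
requests, which may reduce the pressure to keep high-order pages free without
a switch to SLAB.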
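For the earlier point about reclaim kicking in early to keep order-[2-4]
pages available, a rough sketch of watching the free-list state while the
workload runs (standard procfs files, nothing SLUB-specific; `watch` comes
from procps):

    # columns are free block counts per order, order 0 on the left;
    # a Normal zone that rarely holds order>=3 blocks means every order-3
    # GFP_ATOMIC request must be met by splitting or by reclaim/compaction
    watch -n1 cat /proc/buddyinfo

    # per-migratetype breakdown of the same free lists, per order
    cat /proc/pagetypeinfo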