On Wed, 18 Mar 2015, Andreas Hollmann wrote:

> Hi,
>
> I'm looking into memory allocation and page placement on NUMA systems.
>
> My observation is that not all pages touched by a thread running on a
> certain node end up on the node of that thread.
>
> What I did:
>
> - allocating memory with posix_memalign() on page boundaries
> - pinning one thread per node (sched_setaffinity()) and initializing
>   the array using OpenMP
> - checking the page placement using move_pages()
> - moving the pages to the node I expect
> - checking the page placement again
>
> Here is the output:
>
> Thread 3 initialized 4194304 elements.
> Thread 0 initialized 4194304 elements.
> Thread 1 initialized 4194304 elements.
> Thread 2 initialized 4194304 elements.
>
> Each thread had the same number of elements, so that part is fine.
>
> Node 0 Pages 7688
> Node 1 Pages 8192
> Node 2 Pages 8192
> Node 3 Pages 8696
>
> The pages, however, are not evenly distributed. The difference is
> around 13 % ((max-min)/min).
>
> Node 0 Pages 8192
> Node 1 Pages 8192
> Node 2 Pages 8192
> Node 3 Pages 8192
>
> After using move_pages() the result looks like what I expect.
>
> The kernel version is 3.16.4-1.
>
> This was done for a 128 MiB array. Smaller arrays show even worse page
> placement. The behavior is repeatable: I get exactly the same
> distribution each time I run the application.
>
> Is there any explanation for this behavior?

This probably depends on the setting of /proc/sys/vm/zone_reclaim_mode,
which determines whether we try to reclaim locally or immediately
allocate remote memory when local memory is depleted.
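
For reference, here is a minimal sketch (not the poster's actual program) of
the workflow described above: page-aligned allocation, OpenMP first-touch
initialization, and querying per-page placement with move_pages(). It assumes
a 4-node machine and the 128 MiB array from the post; pinning one thread per
node with sched_setaffinity() is omitted for brevity.

/*
 * Sketch: query the NUMA node of every page in a buffer via move_pages()
 * with nodes == NULL, then print a per-node page count.
 * Build with:  gcc -fopenmp placement.c -lnuma
 */
#include <numaif.h>      /* move_pages(), part of libnuma */
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

#define NNODES 4                         /* assumption: 4-node system */
#define NBYTES (128UL * 1024 * 1024)     /* 128 MiB, as in the post */

int main(void)
{
    long page = sysconf(_SC_PAGESIZE);
    size_t npages = NBYTES / page;
    void *buf;

    if (posix_memalign(&buf, page, NBYTES))   /* page-aligned allocation */
        return 1;

    /* First-touch initialization: each OpenMP thread writes its chunk,
     * so the kernel should back those pages from that thread's node. */
    double *a = buf;
    #pragma omp parallel for schedule(static)
    for (size_t i = 0; i < NBYTES / sizeof(double); i++)
        a[i] = 0.0;

    /* Query placement: with nodes == NULL, move_pages() moves nothing and
     * only reports the node of each page in status[]. */
    void **pages = malloc(npages * sizeof(void *));
    int *status  = malloc(npages * sizeof(int));
    for (size_t i = 0; i < npages; i++)
        pages[i] = (char *)buf + i * page;

    if (move_pages(0, npages, pages, NULL, status, 0) < 0) {
        perror("move_pages");
        return 1;
    }

    long count[NNODES] = {0};
    for (size_t i = 0; i < npages; i++)
        if (status[i] >= 0 && status[i] < NNODES)
            count[status[i]]++;          /* negative status = error code */

    for (int n = 0; n < NNODES; n++)
        printf("Node %d Pages %ld\n", n, count[n]);

    free(pages);
    free(status);
    free(buf);
    return 0;
}

The current setting mentioned in the reply can be inspected with
cat /proc/sys/vm/zone_reclaim_mode: 0 means the allocator falls back to
remote nodes when local memory is short, non-zero means it first tries to
reclaim memory on the local node.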