Hi Srikar,

On Sun, Oct 14, 2012 at 12:10:19AM +0530, Srikar Dronamraju wrote:
> * Andrea Arcangeli <aarcange@xxxxxxxxxx> [2012-10-04 01:50:42]:
>
> > Hello everyone,
> >
> > This is a new AutoNUMA27 release for Linux v3.6.
> >
>
> Here results of autonumabenchmark on a 328GB 64 core with ht disabled
> comparing v3.6 with autonuma27.

*snip*

> numa01: 1805.19 1907.11 1866.39 -3.88%

Interesting. So numa01 should be improved in autonuma28fast. Not sure
why the hard binds show any difference, but I'm more concerned with
optimizing numa01. I get the same results from hard bindings on
upstream or autonuma, strange.

Could you repeat only numa01 with the origin/autonuma28fast branch?
Also, if you could post the two PDF convergence charts generated by
numa01 on autonuma27 and autonuma28fast, I think it would be
interesting to see the full effect and why it is faster.

I only had time for a quick push after adding the idea in
autonuma28fast (which is further improved compared to autonuma28), but
I've already been told that it deals with numa01 on the 8 node system
very well, as expected.

numa01 on 8 nodes is a workload without a perfect solution (other than
MADV_INTERLEAVE). Full convergence that prevents cross-node traffic is
impossible because there are 2 processes spanning 8 nodes and all
process memory is touched by all threads constantly. Yet autonuma28fast
should deal optimally with that scenario too.

As a side note: numa01 on 2 nodes instead converges fully (2 processes
+ 2 nodes = full convergence). numa01 on 2 nodes and on >2 nodes are
very different kinds of test.

I'll release an autonuma29 behaving like 28fast if there are no
surprises. The new algorithm change in 28fast will also save memory
once I rewrite it properly.

Thanks!
Andrea