> From: epsi@xxxxxx [mailto:epsi@xxxxxx]
> Sent: Wednesday, October 14, 2009 9:49 AM
> To: Woodruff, Richard; linux-omap@xxxxxxxxxxxxxxx; Premi, Sanjeev
> Subject: Re: RE: RE: Memory performance / Cache problem
>
> Mem clock is both times 166MHz. I don't know whether are differences in cycle
> access and timing, but memclock is fine.

How did you physically verify this?

> Following Siarhei hints of initialize the buffers (around 1.2 MByte each)
> I get different results in 22kernel for use of
> malloc alone
> memcpy = 473.764, loop4 = 448.430, loop1 = 102.770, rand = 29.641
> calloc alone
> memcpy = 405.947, loop4 = 361.550, loop1 = 95.441, rand = 21.853
> malloc+memset:
> memcpy = 239.294, loop4 = 188.617, loop1 = 80.871, rand = 4.726
>
> In 31kernel all 3 measures are about the same (unfortunatly low) level of
> malloc+memset in 22.

Yes, aligned buffers can make a difference, but probably more so for small
copies. Of course you must touch the memory or mprotect() it so it's faulted
in, but indications are you have done this.

> I used a standard memcpy (think this is glib and hence not neonbased)?
> To be neonbased I guess it has to be recompiled?

The version of glibc in use can make a difference. The CodeSourcery 2009
release added PLDs (preload instructions) to the memory operations, which can
give a good benefit. It might be that you have an optimized library in one
case and a non-optimized one in the other.

Regards,
Richard W.
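To make the point about touching the memory concrete, here is a minimal sketch of how the buffers could be page-aligned and faulted in before timing memcpy. It is not the benchmark used in this thread: the names touch_pages, alloc_touched and memcpy_mb_per_s are hypothetical, the empty asm statement is a GCC/Clang-specific compiler barrier, and 1 MB is assumed to mean 10^6 bytes.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L   /* for posix_memalign() and clock_gettime() */
#endif
#include <stdlib.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

/* Page size as reported by the system, with 4096 as an assumed fallback. */
static size_t page_size(void)
{
    long ps = sysconf(_SC_PAGESIZE);
    return ps > 0 ? (size_t)ps : 4096;
}

/* Write one byte per page so every page of the buffer is faulted in
 * before any timing starts. Returns the number of pages written. */
size_t touch_pages(void *buf, size_t len)
{
    volatile unsigned char *p = buf;
    size_t page = page_size();
    size_t touched = 0;

    for (size_t off = 0; off < len; off += page) {
        p[off] = 1;
        touched++;
    }
    return touched;
}

/* Allocate a page-aligned buffer of len bytes and fault it in.
 * Returns NULL on allocation failure; release with free(). */
void *alloc_touched(size_t len)
{
    void *buf = NULL;

    if (posix_memalign(&buf, page_size(), len) != 0)
        return NULL;
    touch_pages(buf, len);
    return buf;
}

/* Copy len bytes from src to dst reps times and return the throughput
 * in MB/s, or -1.0 if the arguments or the measured time are unusable. */
double memcpy_mb_per_s(void *dst, const void *src, size_t len, int reps)
{
    struct timespec t0, t1;

    if (len == 0 || reps <= 0)
        return -1.0;

    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (int i = 0; i < reps; i++) {
        memcpy(dst, src, len);
        /* Compiler barrier (GCC/Clang): keeps each copy from being elided. */
        __asm__ volatile("" : : "r"(dst) : "memory");
    }
    clock_gettime(CLOCK_MONOTONIC, &t1);

    double secs = (double)(t1.tv_sec - t0.tv_sec)
                + (double)(t1.tv_nsec - t0.tv_nsec) / 1e9;
    if (secs <= 0.0)
        return -1.0;
    return (double)len * (double)reps / 1e6 / secs;
}
```

Calling alloc_touched(1200000) once for the source and once for the destination, then memcpy_mb_per_s(dst, src, 1200000, 100), gives a number in the same spirit as the memcpy figures quoted above. The buffer size mirrors the roughly 1.2 MByte from the quoted test; the repetition count is arbitrary. Timing the same copy on buffers that were only malloc'ed, without touch_pages(), shows how much of the difference comes from page faults rather than from the copy itself.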