Re: Fixing the glibc adobe flash incompatibility

Dave Jones <davej@xxxxxxxxxx> · Thu, 18 Nov 2010 11:15:57 -0500

On Thu, Nov 18, 2010 at 04:23:56PM +0100, Jakub Jelinek wrote:

 > It is very sad that Intel/AMD just didn't make sure rep movsb
 > is the fastest copying sequence on all of their CPUs,
 > which underneath could do whatever magic based on size and src/dst
 > alignment (e.g. for small length handle it in hw so it is as quick as
 > possible, for larger sizes perhaps handle it in microcode) - rep movsb
 > can be easily inlined and is quite short as well.  But on many,
 > especially recent, CPUs it performs very badly compared to these much
 > larger SSE*-optimized routines.
 > 
 > If you want exact numbers, best ask Intel folks who wrote and tuned the
 > SSE4.2 memcpy routine.

I wonder if the Intel people who benchmarked memcpy throughput also benchmarked
the increased context-switch time that will happen now that the kernel's lazy-FPU
state saving is effectively disabled every time something calls memcpy.
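
For reference, here is a minimal sketch of the kind of inlined rep movsb
copy mentioned above, assuming GCC or Clang inline asm on x86 (with a plain
byte loop elsewhere). The names copy_rep_movsb and check_copy_rep_movsb are
made up for illustration; they are not glibc functions, and this is not the
tuned routine under discussion. The relevant property is that rep movsb uses
only general-purpose registers, so unlike an SSE copy it does not touch the
FPU/SSE state.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: forward copy of n bytes using a single "rep movsb".
 * Like memcpy, it assumes the buffers do not overlap. */
static void *copy_rep_movsb(void *dst, const void *src, size_t n)
{
    void *ret = dst;
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    /* rep movsb copies (R|E)CX bytes from [(R|E)SI] to [(R|E)DI].
     * The ABI guarantees the direction flag is clear on function entry,
     * so the copy runs forward. No SSE or x87 registers are used. */
    __asm__ volatile ("rep movsb"
                      : "+D" (dst), "+S" (src), "+c" (n)
                      :
                      : "memory");
#else
    /* Portable fallback for non-x86 targets. */
    unsigned char *d = dst;
    const unsigned char *s = src;
    while (n--)
        *d++ = *s++;
#endif
    return ret;
}

/* Sanity check: compare against the libc memcpy for a few sizes. */
static void check_copy_rep_movsb(void)
{
    static const size_t sizes[] = { 0, 1, 15, 16, 17, 4096 };
    static unsigned char src[4096], a[4096], b[4096];

    for (size_t i = 0; i < sizeof src; i++)
        src[i] = (unsigned char)(i * 31 + 7);

    for (size_t k = 0; k < sizeof sizes / sizeof sizes[0]; k++) {
        memset(a, 0, sizeof a);
        memset(b, 0, sizeof b);
        copy_rep_movsb(a, src, sizes[k]);
        memcpy(b, src, sizes[k]);
        assert(memcmp(a, b, sizeof a) == 0);
    }
}
```

Calling check_copy_rep_movsb() from a small main() is enough to confirm the
copy is correct. It says nothing about speed, which is the actual point of
contention here and depends heavily on the CPU.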

	Dave
-- 
devel mailing list
devel@xxxxxxxxxxxxxxxxxxxxxxx
https://admin.fedoraproject.org/mailman/listinfo/devel