Re: [PATCH] block-sha1: more good unaligned memory access candidates

On Thu, 13 Aug 2009, Junio C Hamano wrote:
> 
> Wow.  Is it now faster than the arm/ and ppc/ hand-tweaked assembly?

For the good cases, yes.

For POWER, with gcc-4.4, the C code apparently outperforms the asm code on 
POWER6. The asm code is scheduled for POWER4, and I think outperforms the 
C code there. Also, when compiling in 64-bit mode (with "-m64"), at least 
some versions of gcc seem to do some stupid things and add extra zero 
extension stuff, and that performed suboptimally at least on a PPC G5.

So it's certainly not a clear case of "the C code outperforms the asm 
code", but in BenH's tests, the best numbers really did come from the C 
version, with some silly cases of at least some versions of gcc screwing up 
(not reload, but zero extension) and making it noticeably slower.

IOW, the PPC situation really isn't that different from x86. 

			Linus