Re: Why Git is so fast

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Apr 30, 2009, Steven Noonan <steven@xxxxxxxxxxxxxx> wrote:
> A bit off topic, but the results are rather interesting to me, and I
> think I see a weakness in how GCC is doing this on Intel. Someone
> please correct me if I'm wrong, but the PowerPC code seems much better
> because it can yield very high instruction-level parallelism. It does
> 5 loads and then 5 stores, using 4 registers for temporary storage and
> 2 registers for pointers.
>
> I realize the Intel x86 architecture is quite constrained in that it
> has so few general purpose registers, but there has to be better code
> than what GCC emitted above. It seems like the processor would stall
> because of the quantity of sequential inter-dependent instructions
> that can't be done in parallel (mov to memory that depends on a mov to
> eax, etc).

There aren't any unnecessary dependencies.  Take this sequence:

1:        movl    (%edx), %eax
2:        movl    %eax, (%ecx)
3:        movl    4(%edx), %eax
4:        movl    %eax, 4(%ecx)

There are two unavoidable dependencies - #2 depends on #1, and #4
depends on #3.  #3 does not depend on #2, even though they both
use %eax, because #3 is a write to %eax.  So whatever was in %eax
before #3 is irrelevant.  The processor knows this and will use
register renaming to execute #1 and #3 in parallel, and #2 and #4
in parallel.

James
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]