Re: Debugging git-commit slowness on a large repo

On Fri, Dec 02, 2011 at 11:17:10PM +0000, Joshua Redstone wrote:
> Hi,
> I have a git repo with about 300k commits,  150k files totaling maybe 7GB.
>  Locally committing a small change - say touching fewer than 300 bytes
> across 4 files - consistently takes over one second, which seems kinda
> slow.  This is using git 1.7.7.4 on a linux 2.6 box.  The time does not
> improve after doing a git-gc (my .git dir has maybe 250 files after a git
> gc).  The same size commit on a brand new repo takes < 10ms.  Any thoughts
> on why committing a small change seems to take a long time on larger repos?

By "same size commit" do you mean the same amount of changes, or the
same number of files? Commit time doesn't depend on the size of the
repo (by itself), but on the size of the index, which depends on the
number of files in the snapshot being committed (as git is
snapshot-based). At one point, commit forgot how to write the tree
cache to the index (a performance optimisation). Do the times improve
if you run 'git read-tree HEAD' between one commit and the next? Note
that this will reset the index to the last commit, though for the
tests I imagine you use some variation of 'git commit -a'.
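
A rough way to run that comparison, as a sketch only: the helper names
(elapsed_ms, timed_commit, compare_commits) are made up for
illustration, the timing relies on GNU date's %N, and it assumes you
are inside the work tree and pass the path of a tracked file.

```shell
# elapsed_ms CMD...: run CMD and print its wall-clock time in milliseconds.
# Hypothetical helper; relies on GNU date's %N (nanoseconds).
elapsed_ms() {
    start=$(date +%s%N)
    "$@" >/dev/null
    end=$(date +%s%N)
    echo $(( (end - start) / 1000000 ))
}

# timed_commit FILE: append a line to a tracked FILE and time 'git commit -a'.
timed_commit() {
    echo "timing test" >> "$1"
    elapsed_ms git commit -a -q -m "timing test"
}

# compare_commits FILE: time one commit as-is, then one made right after
# 'git read-tree HEAD' has rewritten the index from the last commit.
compare_commits() {
    echo "plain:           $(timed_commit "$1") ms"
    git read-tree HEAD
    echo "after read-tree: $(timed_commit "$1") ms"
}
```

Something like 'compare_commits path/to/tracked-file' creates two
throwaway commits, so it is best run on a scratch clone. A single pair
of numbers is noisy; repeating the run a few times gives a better
picture.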

Thomas Rast wrote a patch to re-teach commit to store the tree cache,
but there were some issues and it never got applied.

> 
> Fwiw, I also tried doing the same test using libgit2 (via the pygit2
> wrapper), and it was even slower (about 6 seconds to commit the same small
> change).

I don't know about the Python bindings, but in the (somewhat
unscientific) tests for libgit2's write-tree (the slow part of
creating a commit), it performs slightly faster than git's (though I
think git's write-tree does update the tree cache, which libgit2
doesn't currently). The speed could just be a side-effect of the small
test repo. From your domain, I assume the data is not for public
consumption, but it'd be great if you could post your code to pygit2's
issue tracker so we can see how much of the slowdown comes from the
bindings or the library.
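
To see how much of the commit time on the git side is spent in
write-tree alone, something like the following could be used. It is a
sketch under assumptions: time_write_tree is a made-up helper name,
and the timing again relies on GNU date's %N.

```shell
# time_write_tree: time 'git write-tree' on the current index and print
# the resulting tree id with the elapsed wall-clock time in milliseconds.
# Hypothetical helper; relies on GNU date's %N (nanoseconds).
time_write_tree() {
    start=$(date +%s%N)
    tree=$(git write-tree) || return 1
    end=$(date +%s%N)
    echo "write-tree $tree: $(( (end - start) / 1000000 )) ms"
}
```

Stage a small change first (for example with 'git add -u'), then call
time_write_tree twice in a row. If write-tree does store the tree
cache in the index, the second call should have less work to do.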

   cmn
