Re: git and larger trees, not so fast?

On Thu, 9 Aug 2007, Junio C Hamano wrote:
> > 
> > (I didn't test it, though, maybe I missed something).
> 
> I do not think the change affects the normal codepath.  The
> one-liner patch to git-commit.sh touches the codepath that
> updates the index used only to write out the partial commit, and
> losing the cached stat info from that index does not matter, as
> that index is removed immediately after writing the tree out and
> is never compared with working tree as far as I can tell.

You are, of course, mostly right. Using "-m" there is largely pointless, 
since it's a throw-away index, and we'll only ever use the exact paths 
that were given to us.

However, it does actually matter for one case: the case where you give a 
directory name or other name pattern, resulting in a *lot* of filenames. 
In that case, the commit will end up piping that (potentially very large) 
list to "git update-index --add --remove --stdin", and that will now mean 
that they *all* get their SHA1's recomputed.
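
To make that concrete, here is a rough sketch of what that path amounts to. 
It is not the actual git-commit.sh code: the scratch repository, the file 
names and the "tmp-index" name are all made up for illustration, and I'm 
assuming that a plain "git read-tree" into a fresh index is a fair stand-in 
for a throw-away index that has no cached stat info.

```shell
#!/bin/sh
# Illustration only: a scratch repo with a hypothetical "fs" sub-tree.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
mkdir fs
echo one > fs/a.c
echo two > fs/b.c
git add .

# Build a throw-away index from the current tree. Reading a tree without
# "-m" leaves the new index with no cached stat info for any entry.
tmp_index="$repo/.git/tmp-index"
tree=$(git write-tree)
GIT_INDEX_FILE="$tmp_index" git read-tree "$tree"

# Expand the directory name into the full path list and pipe all of it to
# update-index. With no stat info to trust, every path has to be re-hashed.
count=$(git ls-files -- fs | wc -l | tr -d ' ')
git ls-files -z -- fs |
	GIT_INDEX_FILE="$tmp_index" git update-index --add --remove -z --stdin

# Nothing changed in the working tree, so the resulting tree is the same one.
new_tree=$(GIT_INDEX_FILE="$tmp_index" git write-tree)
echo "paths=$count"
[ "$new_tree" = "$tree" ] && echo "tree unchanged"
```

With two files nobody will notice the cost, but the amount of hashing grows 
with the number of paths that the name pattern expands to, not with the 
number of files that actually changed.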

Of course, that was the other performance bug that we already knew about 
(except we were thinking of "git add .", and fixed that case). So we're 
already slow at it - but we *shouldn't* be.

Try this on the kernel archive (use a clean one, so these things *should* 
all be no-ops):

	time sh -c "git add . ; git commit"

which is nice and fast and takes just over a second for me, but then try

	time git commit .

which *should* be nice and fast, but it takes forever, because we now 
re-compute all the SHA1's for *every* file. Of course, if it's all in the 
cache, it's still just 4s for me, but I tried with a cold cache, and it 
was over half a minute!

(I don't actually ever do something like "git commit .", but I could see 
people doing it. What I *do* do is that if I have multiple independent 
changes, I may actually do "git commit fs" to commit just part of them, 
and rather than list all the files, I literally just say "commit that 
sub-tree". So this really is another valid performance issue).

Sad.

			Linus