Fix "git commit directory/" performance anomaly

This trivial patch avoids re-hashing files that are already clean in the 
index. This mirrors what commit 0781b8a9b2fe760fc4ed519a3a26e4b9bd6ccffe 
did for "git add .", only for "git commit ." instead.

This improves the cold-cache case immensely, since we don't need to bring 
in all the file contents, just the index and any files dirty in the index.

Before:

	[torvalds@woody linux]$ time git commit .
	real    1m49.537s
	user    0m3.892s
	sys     0m2.432s

After:

	[torvalds@woody linux]$ time git commit .
	real    0m14.273s
	user    0m1.312s
	sys     0m0.516s

(both after doing an "echo 3 > /proc/sys/vm/drop_caches" to get cold-cache
behaviour. Even with the index optimization, git still has to "lstat()"
all the files, so with a truly cold cache, bringing all the inodes in
will take some time).

Signed-off-by: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
---
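
For anybody not familiar with the trick described in the commit message:
the saving comes from comparing the stat data remembered for a path against
a fresh lstat(), and only reading and hashing the file contents when they
differ. What follows is a minimal sketch of that idea, not git's code. The
names (struct cached_stat, remember_stat, stat_changed, needs_rehash) and
the exact set of fields compared are assumptions made up for illustration;
the real check used by the patch is ce_match_stat().

```c
/* Illustrative sketch only: hypothetical names, not git's data structures. */
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/stat.h>

/* Hypothetical stand-in for the stat data an index entry remembers. */
struct cached_stat {
	time_t mtime;
	time_t ctime;
	off_t size;
	ino_t ino;
	dev_t dev;
	mode_t mode;
};

/* Copy the interesting fields out of a fresh stat result. */
static void remember_stat(struct cached_stat *c, const struct stat *st)
{
	assert(c && st);
	c->mtime = st->st_mtime;
	c->ctime = st->st_ctime;
	c->size = st->st_size;
	c->ino = st->st_ino;
	c->dev = st->st_dev;
	c->mode = st->st_mode;
}

/* Return 0 if nothing changed, nonzero if any remembered field differs. */
static int stat_changed(const struct cached_stat *c, const struct stat *st)
{
	assert(c && st);
	return c->mtime != st->st_mtime ||
	       c->ctime != st->st_ctime ||
	       c->size != st->st_size ||
	       c->ino != st->st_ino ||
	       c->dev != st->st_dev ||
	       c->mode != st->st_mode;
}

/*
 * Decide whether "path" has to be read and hashed again.
 * A NULL "c" means there is no remembered entry for the path.
 */
static int needs_rehash(const struct cached_stat *c, const char *path)
{
	struct stat st;

	if (lstat(path, &st) < 0)
		return 1;	/* cannot stat it: leave it to the slow path */
	if (c && !stat_changed(c, &st))
		return 0;	/* clean: skip reading the contents entirely */
	return 1;
}
```

The point is that the clean case costs one lstat() per file and never
touches the file data, which is why the cold-cache numbers above improve
so much while the lstat() cost itself remains.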

On Fri, 10 Aug 2007, Linus Torvalds wrote:
> 
> Try this on the kernel archive (use a clean one, so these things *should* 
> all be no-ops):
> 
> 	time sh -c "git add . ; git commit"
> 
> which is nice and fast and takes just over a second for me, but then try
> 
> 	time git commit .
> 
> which *should* be nice and fast, but it takes forever, because we now 
> re-compute all the SHA1's for *every* file. Of course, if it's all in the 
> cache, it's still just 4s for me, but I tried with a cold cache, and it 
> was over half a minute!
> 
> (I don't actually ever do something like "git commit .", but I could see 
> people doing it. What I *do* do is that if I have multiple independent 
> changes, I may actually do "git commit fs" to commit just part of them, 
> and rather than list all the files, I literally just say "commit that 
> sub-tree". So this really is another valid performance issue).
> 
> Sad.
> 
> 			Linus
> 
---
 builtin-update-index.c |   10 ++++++++--
 1 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/builtin-update-index.c b/builtin-update-index.c
index 509369e..8d22dfa 100644
--- a/builtin-update-index.c
+++ b/builtin-update-index.c
@@ -86,9 +86,15 @@ static int process_lstat_error(const char *path, int err)
 
 static int add_one_path(struct cache_entry *old, const char *path, int len, struct stat *st)
 {
-	int option, size = cache_entry_size(len);
-	struct cache_entry *ce = xcalloc(1, size);
+	int option, size;
+	struct cache_entry *ce;
+
+	/* Was the old index entry already up-to-date? */
+	if (old && !ce_stage(old) && !ce_match_stat(old, st, 0))
+		return 0;
 
+	size = cache_entry_size(len);
+	ce = xcalloc(1, size);
 	memcpy(ce->name, path, len);
 	ce->ce_flags = htons(len);
 	fill_stat_cache_info(ce, st);