[PATCH] git-fetch: Avoid reading packed refs over and over again

When checking which tags to fetch, the old code called
git-show-ref --verify once for _each_ remote tag. Since reading even
packed refs is not cheap when there are a lot of local refs, and
every call read them all over again, the loop became quite slow.

Fix this by teaching git-show-ref a new option, --filter-invalid.
With it, git-show-ref reads lines of the form 'sha1 refname' from
stdin and drops those whose refname already exists locally, so that
only the refs which are not stored locally are printed. The local
refs are read just once.

Signed-off-by: Johannes Schindelin <johannes.schindelin@xxxxxx>
---
	Since this option is purely for use in git-fetch, I did not even
	bother documenting it.
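
	For anyone who wants to see the stdin/stdout contract without
	building the patch, here is a rough emulation in shell. It is only
	a sketch under assumptions: filter_missing_refs is a hypothetical
	helper, and a plain text file of ref names stands in for whatever
	for_each_ref() would report in a real repository.

```shell
#!/bin/sh
# Hypothetical stand-in for "git-show-ref --filter-invalid" (illustration only).
#   $1     file listing local ref names, one per line
#          (assumption: replaces the for_each_ref() walk)
#   stdin  lines of the form "<40-hex sha1> <refname>"
#   stdout the input lines whose refname is NOT in the local list
filter_missing_refs () {
	awk -v refs="$1" '
		# Load the local ref names once, before any input is read.
		BEGIN { while ((getline line < refs) > 0) have[line] = 1 }
		# Like the C code: pass through lines shorter than 41 characters,
		# and lines whose refname (text after the 41st character) is unknown.
		length($0) < 41 || !(substr($0, 42) in have)
	' -
}

# Demo with made-up data.
sha=0123456789abcdef0123456789abcdef01234567
local_refs=$(mktemp)
printf '%s\n' refs/tags/v1.0 refs/heads/master > "$local_refs"

printf '%s refs/tags/v1.0\n%s refs/tags/v1.1\n' "$sha" "$sha" |
filter_missing_refs "$local_refs"

rm -f "$local_refs"
```

	Only the refs/tags/v1.1 line is printed, because refs/tags/v1.0 is
	in the local list.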

	This patch would have been so much cleaner if git-fetch were written
	in C... But since it has accumulated so many functions by now, I do
	not see much chance of that (at least in the near future).

	In very unscientific tests, a single read_packed_refs() in the
	lilypond repo took 0.1 seconds. Yep, that's 1/10th of a second. So
	the while loop in git-fetch took more than 10 seconds for 107 tags.
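
	That figure is simply the per-call cost multiplied by the number of
	tags. A quick sanity check, where estimate_loop_seconds is a made-up
	helper and the model assumes one full ref read per remote tag and
	ignores every other cost:

```shell
#!/bin/sh
# Hypothetical helper: rough lower bound on the time the old per-tag loop
# spends re-reading refs, assuming one full read per remote tag.
#   $1  number of remote tags
#   $2  seconds per read
estimate_loop_seconds () {
	# LC_ALL=C keeps the decimal separator a "." regardless of locale.
	LC_ALL=C awk -v n="$1" -v t="$2" 'BEGIN { printf "%.1f\n", n * t }'
}

estimate_loop_seconds 107 0.1   # prints 10.7
```

	With the filter in place the refs are read once, so under the same
	assumption the cost no longer grows with the number of remote tags.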

 builtin-show-ref.c |   28 +++++++++++++++++++++++++++-
 git-fetch.sh       |    2 +-
 2 files changed, 28 insertions(+), 2 deletions(-)

diff --git a/builtin-show-ref.c b/builtin-show-ref.c
index f6929d9..c0b55c1 100644
--- a/builtin-show-ref.c
+++ b/builtin-show-ref.c
@@ -2,8 +2,9 @@
 #include "refs.h"
 #include "object.h"
 #include "tag.h"
+#include "path-list.h"
 
-static const char show_ref_usage[] = "git show-ref [-q|--quiet] [--verify] [-h|--head] [-d|--dereference] [-s|--hash[=<length>]] [--abbrev[=<length>]] [--tags] [--heads] [--] [pattern*]";
+static const char show_ref_usage[] = "git show-ref [-q|--quiet] [--verify] [-h|--head] [-d|--dereference] [-s|--hash[=<length>]] [--abbrev[=<length>]] [--tags] [--heads] [--] [pattern*] | --filter-invalid < ref-list";
 
 static int deref_tags = 0, show_head = 0, tags_only = 0, heads_only = 0,
 	found_match = 0, verify = 0, quiet = 0, hash_only = 0, abbrev = 0;
@@ -86,6 +87,29 @@ match:
 	return 0;
 }
 
+static int add_valid(const char *refname, const unsigned char *sha1, int flag, void *cbdata)
+{
+	struct path_list *list = (struct path_list *)cbdata;
+	path_list_insert(refname, list);
+	return 0;
+}
+
+static int filter_invalid(void)
+{
+	static struct path_list valid_refs = { NULL, 0, 0, 0 };
+	char buf[1024];
+
+	for_each_ref(add_valid, &valid_refs);
+	while (fgets(buf, sizeof(buf), stdin)) {
+		int len = strlen(buf);
+		if (len > 0 && buf[len - 1] == '\n')
+			buf[--len] = '\0';
+		if (len < 41 || !path_list_has_path(&valid_refs, buf + 41))
+			printf("%s\n", buf);
+	}
+	return 0;
+}
+
 int cmd_show_ref(int argc, const char **argv, const char *prefix)
 {
 	int i;
@@ -153,6 +177,8 @@ int cmd_show_ref(int argc, const char **argv, const char *prefix)
 			heads_only = 1;
 			continue;
 		}
+		if (!strcmp(arg, "--filter-invalid"))
+			return filter_invalid();
 		usage(show_ref_usage);
 	}
 	if (verify) {
diff --git a/git-fetch.sh b/git-fetch.sh
index 3feba32..d1c00db 100755
--- a/git-fetch.sh
+++ b/git-fetch.sh
@@ -474,9 +474,9 @@ case "$no_tags$tags" in
 		echo "$ls_remote_result" |
 		sed -n	-e 's|^\('"$_x40"'\)	\(refs/tags/.*\)^{}$|\1 \2|p' \
 			-e 's|^\('"$_x40"'\)	\(refs/tags/.*\)$|\1 \2|p' |
+		git-show-ref --filter-invalid |
 		while read sha1 name
 		do
-			git-show-ref --verify --quiet -- "$name" && continue
 			git-check-ref-format "$name" || {
 				echo >&2 "warning: tag ${name} ignored"
 				continue
-- 
1.4.4.1.g3c2a-dirty
