[RFC/PATCHv2 6/6] limit "contains" traversals based on commit generation

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



When looking for commits that contain other commits (e.g.,
via "git tag --contains"), we can end up traversing useless
portions of the graph. For example, if I am looking for a
tag that contains a commit made last week, there is not much
point in traversing portions of the history graph made five
years ago.

This optimization can provide massive speedups. For example,
doing "git tag --contains HEAD~1000" in the linux-2.6
repository goes from:

  real    0m3.139s
  user    0m3.044s
  sys     0m0.092s

to:

  real    0m0.035s
  user    0m0.028s
  sys     0m0.004s

We could use commit timestamps to know when we are going too
far back in history, but they are sometimes not trustworthy.
Extreme clock skew on committers' machines (or bugs in
commit-generating software) mean that we may stop the
traversal too early when seeing commits skewed into the
past.

Instead, we use the calculated commit generation, which is a
propery of the graph itself (but since we cache it, it's
still cheap to consult).

Signed-off-by: Jeff King <peff@xxxxxxxx>
---
Same as previous version.

 builtin/tag.c |   20 +++++++++++++++++---
 1 files changed, 17 insertions(+), 3 deletions(-)

diff --git a/builtin/tag.c b/builtin/tag.c
index 63bce6e..df6de47 100644
--- a/builtin/tag.c
+++ b/builtin/tag.c
@@ -40,7 +40,8 @@ static int in_commit_list(const struct commit_list *want, struct commit *c)
 }
 
 static int contains_recurse(struct commit *candidate,
-			    const struct commit_list *want)
+			    const struct commit_list *want,
+			    unsigned long cutoff)
 {
 	struct commit_list *p;
 
@@ -57,9 +58,13 @@ static int contains_recurse(struct commit *candidate,
 	if (parse_commit(candidate) < 0)
 		return 0;
 
+	/* stop searching if we go too far back in time */
+	if (commit_generation(candidate) < cutoff)
+		return 0;
+
 	/* Otherwise recurse and mark ourselves for future traversals. */
 	for (p = candidate->parents; p; p = p->next) {
-		if (contains_recurse(p->item, want)) {
+		if (contains_recurse(p->item, want, cutoff)) {
 			candidate->object.flags |= TMP_MARK;
 			return 1;
 		}
@@ -70,7 +75,16 @@ static int contains_recurse(struct commit *candidate,
 
 static int contains(struct commit *candidate, const struct commit_list *want)
 {
-	return contains_recurse(candidate, want);
+	unsigned long cutoff = ULONG_MAX;
+	const struct commit_list *c;
+
+	for (c = want; c; c = c->next) {
+		unsigned long g = commit_generation(c->item);
+		if (g < cutoff)
+			cutoff = g;
+	}
+
+	return contains_recurse(candidate, want, cutoff);
 }
 
 static int show_reference(const char *refname, const unsigned char *sha1,
-- 
1.7.6.37.g989c6
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]