[PATCH] revision: avoid work after --max-count is reached

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



During a revision traversal in which --max-count has been
specified, we decrement a counter for each revision returned
by get_revision. When it hits 0, we typically return NULL
(the exception being if we still have boundary commits to
show).

However, before we check the counter, we call get_revision_1
to get the next commit. This might involve looking at a
large number of commits if we have restricted the traversal
(e.g., we might traverse until we find the next commit whose
diff actually matches a pathspec).

There's no need to make this get_revision_1 call when our
counter runs out. If we are not in --boundary mode, we will
just throw away the result and immediately return NULL. If
we are in --boundary mode, then we will still throw away the
result, and then start showing the boundary commits.
However, as git_revision_1 does not impact the boundary
list, it should not have an impact.

In most cases, avoiding this work will not be especially
noticeable. However, in some cases, it can make a big
difference:

  [before]
  $ time git rev-list -1 origin Documentation/RelNotes/1.7.11.2.txt
  8d141a1d562abb31f27f599dbf6e10a6c06ed73e

  real    0m0.301s
  user    0m0.280s
  sys     0m0.016s

  [after]
  $ time git rev-list -1 origin Documentation/RelNotes/1.7.11.2.txt
  8d141a1d562abb31f27f599dbf6e10a6c06ed73e

  real    0m0.010s
  user    0m0.008s
  sys     0m0.000s

Note that the output is produced almost instantaneously in
the first case, and then git uselessly spends a long time
looking for the next commit to touch that file (but there
isn't one, and we traverse all the way down to the roots).

Signed-off-by: Jeff King <peff@xxxxxxxx>
---
I'd really like more sets of eyes to make sure this won't
cause a weird regression. Calling get_revision_1 does tweak
the flags on objects as it traverses, so I'm worried about
some other code relying on this side effect of get_revision
in some way. That does seem a bit crazy to me, though.

 revision.c | 39 +++++++++++++++++++--------------------
 1 file changed, 19 insertions(+), 20 deletions(-)

diff --git a/revision.c b/revision.c
index 5b81a92..7e39655 100644
--- a/revision.c
+++ b/revision.c
@@ -2361,29 +2361,28 @@ static struct commit *get_revision_internal(struct rev_info *revs)
 	}
 
 	/*
-	 * Now pick up what they want to give us
+	 * If our max_count counter has reached zero, then we are done. We
+	 * don't simply return NULL because we still might need to show
+	 * boundary commits. But we want to avoid calling get_revision_1, which
+	 * might do a considerable amount of work finding the next commit only
+	 * for us to throw it away.
+	 *
+	 * If it is non-zero, then either we don't have a max_count at all
+	 * (-1), or it is still counting, in which case we decrement.
 	 */
-	c = get_revision_1(revs);
-	if (c) {
-		while (0 < revs->skip_count) {
-			revs->skip_count--;
-			c = get_revision_1(revs);
-			if (!c)
-				break;
+	if (revs->max_count) {
+		c = get_revision_1(revs);
+		if (c) {
+			while (0 < revs->skip_count) {
+				revs->skip_count--;
+				c = get_revision_1(revs);
+				if (!c)
+					break;
+			}
 		}
-	}
 
-	/*
-	 * Check the max_count.
-	 */
-	switch (revs->max_count) {
-	case -1:
-		break;
-	case 0:
-		c = NULL;
-		break;
-	default:
-		revs->max_count--;
+		if (revs->max_count > 0)
+			revs->max_count--;
 	}
 
 	if (c)
-- 
1.7.11.35.gbaf554e.dirty
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]