Re: [PATCH] for-each-ref: avoid loading objects to print %(objectname)

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Sat, Oct 26, 2013 at 10:35:17AM +0200, Thomas Rast wrote:

> Jeff King <peff@xxxxxxxx> writes:
> 
> > diff --git a/t/t6300-for-each-ref.sh b/t/t6300-for-each-ref.sh
> > index 752f5cb..2b4b9a9 100755
> [...]
> > +test_atom head *objectname ''
> > +test_atom head *objecttype ''
> [...]
> > +test_atom tag *objectname '67a36f10722846e891fbada1ba48ed035de75581'
> > +test_atom tag *objecttype 'commit'
> 
> Can you quote the *?  I may have become somewhat paranoid, but still.
> This is the first use of the *<field> syntax, and test_atom seems
> written to correctly quote its arguments, so why risk it? :-)

Yeah, that is a very reasonable suggestion. And the patch is still in
pu, so it's not too late to squash.

Junio, here is a resend with the asterisks quoted, to replace what's in
jk/for-each-ref-skip-parsing.  I also double-checked that test_atom
keeps them properly quoted (it does).

-- >8 --
Subject: for-each-ref: avoid loading objects to print %(objectname)

If you ask for-each-ref to print each ref and its object,
like:

  git for-each-ref --format='%(objectname) %(refname)'

this should involve little more work than looking at the ref
files (and packed-refs) themselves. However, for-each-ref
will actually load each object from disk just to print its
sha1. For most repositories, this isn't a big deal, but it
can be noticeable if you have a large number of refs to
print. Here are best-of-five timings for the command above
on a repo with ~10K refs:

  [before]
  real    0m0.112s
  user    0m0.092s
  sys     0m0.016s

  [after]
  real    0m0.014s
  user    0m0.012s
  sys     0m0.000s

This patch checks for %(objectname) and %(objectname:short)
before we actually parse the object (and the rest of the
code is smart enough to avoid parsing if we have filled all
of our placeholders).

Note that we can't simply move the objectname parsing code
into the early loop. If the "deref" form %(*objectname) is
used, then we do need to parse the object in order to peel
the tag. So instead of moving the code, we factor it out
into a separate function that can be called for both cases.

While we're at it, we add some basic tests for the
dereferenced placeholders, which were not tested at all
before. This helps ensure we didn't regress that case.

Signed-off-by: Jeff King <peff@xxxxxxxx>
---
 builtin/for-each-ref.c  | 29 ++++++++++++++++++++---------
 t/t6300-for-each-ref.sh |  4 ++++
 2 files changed, 24 insertions(+), 9 deletions(-)

diff --git a/builtin/for-each-ref.c b/builtin/for-each-ref.c
index 1d4083c..d096051 100644
--- a/builtin/for-each-ref.c
+++ b/builtin/for-each-ref.c
@@ -205,6 +205,22 @@ static void *get_obj(const unsigned char *sha1, struct object **obj, unsigned lo
 	return buf;
 }
 
+static int grab_objectname(const char *name, const unsigned char *sha1,
+			    struct atom_value *v)
+{
+	if (!strcmp(name, "objectname")) {
+		char *s = xmalloc(41);
+		strcpy(s, sha1_to_hex(sha1));
+		v->s = s;
+		return 1;
+	}
+	if (!strcmp(name, "objectname:short")) {
+		v->s = xstrdup(find_unique_abbrev(sha1, DEFAULT_ABBREV));
+		return 1;
+	}
+	return 0;
+}
+
 /* See grab_values */
 static void grab_common_values(struct atom_value *val, int deref, struct object *obj, void *buf, unsigned long sz)
 {
@@ -225,15 +241,8 @@ static void grab_common_values(struct atom_value *val, int deref, struct object
 			v->ul = sz;
 			v->s = s;
 		}
-		else if (!strcmp(name, "objectname")) {
-			char *s = xmalloc(41);
-			strcpy(s, sha1_to_hex(obj->sha1));
-			v->s = s;
-		}
-		else if (!strcmp(name, "objectname:short")) {
-			v->s = xstrdup(find_unique_abbrev(obj->sha1,
-							  DEFAULT_ABBREV));
-		}
+		else if (deref)
+			grab_objectname(name, obj->sha1, v);
 	}
 }
 
@@ -676,6 +685,8 @@ static void populate_value(struct refinfo *ref)
 			}
 			continue;
 		}
+		else if (!deref && grab_objectname(name, ref->objectname, v))
+			continue;
 		else
 			continue;
 
diff --git a/t/t6300-for-each-ref.sh b/t/t6300-for-each-ref.sh
index 752f5cb..da5fb6c 100755
--- a/t/t6300-for-each-ref.sh
+++ b/t/t6300-for-each-ref.sh
@@ -58,6 +58,8 @@ test_atom head parent ''
 test_atom head numparent 0
 test_atom head object ''
 test_atom head type ''
+test_atom head '*objectname' ''
+test_atom head '*objecttype' ''
 test_atom head author 'A U Thor <author@xxxxxxxxxxx> 1151939924 +0200'
 test_atom head authorname 'A U Thor'
 test_atom head authoremail '<author@xxxxxxxxxxx>'
@@ -91,6 +93,8 @@ test_atom tag parent ''
 test_atom tag numparent ''
 test_atom tag object '67a36f10722846e891fbada1ba48ed035de75581'
 test_atom tag type 'commit'
+test_atom tag '*objectname' '67a36f10722846e891fbada1ba48ed035de75581'
+test_atom tag '*objecttype' 'commit'
 test_atom tag author ''
 test_atom tag authorname ''
 test_atom tag authoremail ''
-- 
1.8.4.1.898.g8bf8a41.dirty

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]