Re: [PATCH] describe: Don’t look up commits with --exact-match

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Anders Kaseorg <andersk@xxxxxxxxxxx> writes:

> From 2ad1e58b8f6e9c117c77748b6e8b85227d9d5412 Mon Sep 17 00:00:00 2001
> From: Anders Kaseorg <andersk@xxxxxxxxxxx>
> Subject: [PATCH 1/2] describe: Use for_each_rawref
>
> Donât waste time checking for dangling refs; they wouldnât affect the
> output of âgit describeâ anyway.  Although this doesnât gain much
> performance by itself, it does in conjunction with the next commit.
>
> Signed-off-by: Anders Kaseorg <andersk@xxxxxxxxxxx>
> ---
>  builtin/describe.c |    2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)
>
> diff --git a/builtin/describe.c b/builtin/describe.c
> index 43caff2..700f740 100644
> --- a/builtin/describe.c
> +++ b/builtin/describe.c
> @@ -418,7 +418,7 @@ int cmd_describe(int argc, const char **argv, const char *prefix)
>  		return cmd_name_rev(i + argc, args, prefix);
>  	}
>  
> -	for_each_ref(get_name, NULL);
> +	for_each_rawref(get_name, NULL);
>  	if (!found_names && !always)
>  		die("No names found, cannot describe anything.");

It is Ok during a review cycle that is not meant as the final submission,
but please follow Documentation/SubmittingPatches for the final round
(i.e. not multiple-changes in a single mail), as mentioned by Jonathan.

> From ce8a2ab9cf80247c2834d21f36f63cedb794e62f Mon Sep 17 00:00:00 2001
> From: Anders Kaseorg <andersk@xxxxxxxxxxx>
> Subject: describe: Donât look up commits with --exact-match
>
> This makes âgit describe --exact-match HEADâ about 15 times faster on
> a cold cache (2.3s instead of 35s) in a linux-2.6 repository with many
> packed tags.  Thatâs a huge win for the interactivity of the __git_ps1
> shell prompt helper.

Good description of the motivation.  Care to describe how it is achieved?

I may be misreading your intention, but my understanding of the thinking
behind what the new code does is...

 - When the object name given from the command line happens to match one
   of the tags exactly (either a lightweight tag in which case we can just
   compare the object names to detect the match, or an annotated tag in
   which case we only need to parse the tag but not the underlying commit
   to do the comparison), we do not have to parse any commit object.

 - Other cases we would need to parse commit objects and traverse the
   history.

 - Hence, the code can be optimized for the case to describe an exact
   match by trying exact object name match first and then loading commit
   objects only when no exact matches found.

 - When we are only looking for exact matches (which is _not_ a rare
   corner case), we won't ever fall back to the "parse commit objects and
   traverse the history" codepath.  So it is important to lazily parse
   commit objects in order to optimize for this special case.

I however think this may hurt when more than one objects are asked to be
described and there is no exact match.  The fallback codepath that lazily
runs lookup_commit_reference_gently() and uses replace_name() to compute
the best name needs to run only once, but this part of the code seems to
run it for the elements on the names list once per describe() invocation
that needs to fall back to this codepath.  Would it remove this regression
if we make it run only once?

I don't think this affects correctness, though.

> @@ -259,6 +255,12 @@ static void describe(const char *arg, int last_one)
>  	if (debug)
>  		fprintf(stderr, "searching to describe %s\n", arg);
>  
> +	for (e = names; e; e = e->next) {
> +		struct commit *c = lookup_commit_reference_gently(e->peeled, 1);
> +		if (c && replace_name(c->util, e->prio, e->sha1, &e->tag))
> +			c->util = e;
> +	}
> +
>  	list = NULL;
>  	cmit->object.flags = SEEN;
>  	commit_list_insert(cmit, &list);

As you can see from its "if (!e->tag)" codepath, replace_name() is not
designed to be called excessively (e.g. we will end up running the
codepath for an annotated tag that we should know we won't find a tag).

Also, "struct tag *tag" does not need to exist in add_to_known_names()
anymore; a NULL is assigned to it and its only use is to get assigned
to e->tag.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]