Re: [PATCH 1/1] commit-graph.c: avoid unnecessary tag dereference when merging

Junio C Hamano <gitster@xxxxxxxxx> · Fri, 20 Mar 2020 22:01:54 -0700

Taylor Blau <me@xxxxxxxxxxxx> writes:

> When performing a 'git commit-graph write' with '--split', the
> commit-graph machinery calls 'merge_commit_graph()' after deciding on a
> split strategy to optionally clean up any existing commit-graph
> layers that were made obsolete by the split strategy [1].
>
> At this time, 'merge_commit_graph()' checks each commit that it writes
> into the merged graph to make sure that it still exists in the object
> store.
>
> To do this, it uses 'lookup_commit_reference_gently()', which accepts
> either a commit object, or a tag that refers to a commit. However, since
> all 'oid' arguments passed to this function are from within the
> commit-graphs being merged, we never pass a commit reference, and so any
> time we spend in 'deref_tag()' is wasted.

Ahh, so my question on the cover letter was utterly off the mark.
It is that feeding a commit to deref_tag() is unneeded.  It is quite
surprising to hear that deref_tag() call is _so_ expensive that it
wastes 7% of the total cycles.

The patch itself looks good.

Thanks.

>  	for (i = 0; i < g->num_commits; i++) {
>  		struct object_id oid;
> -		struct commit *result;
> +		struct commit *result = NULL;
>  
>  		display_progress(ctx->progress, i + 1);
>  
>  		load_oid_from_graph(g, i + offset, &oid);
>  
>  		/* only add commits if they still exist in the repo */
> -		result = lookup_commit_reference_gently(ctx->r, &oid, 1);
> +		if (repo_has_object_file(ctx->r, &oid)) {
> +			result = lookup_commit(ctx->r, &oid);
> +			if (repo_parse_commit(ctx->r, result))
> +				result = NULL;
> +		}
>  
>  		if (result) {
>  			ctx->commits.list[ctx->commits.nr] = result;