Re: [PATCH on master v2] revision: use commit graph in get_reference()

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Jonathan Tan <jonathantanmy@xxxxxxxxxx> writes:

> When fetching into a repository, a connectivity check is first made by
> check_exist_and_connected() in builtin/fetch.c that runs:
> ...
> Another way to accomplish this effect would be to modify parse_object()
> to use the commit graph if possible; however, I did not want to change
> parse_object()'s current behavior of always checking the object
> signature of the returned object.

Sounds good.

> Signed-off-by: Jonathan Tan <jonathantanmy@xxxxxxxxxx>
> ---
> This patch is now on master.

OK.  

Obviously that won't apply to the base for v1 without conflicts, and
it of course applies cleanly on 'master', but the result of doing so
will cause the same conflicts when sb/more-repo-in-api is merged on
top, which means that the same conflicts need to be resolved if this
wants to be merged to 'next' (or 'pu', FWIW).

> diff --git a/commit-graph.c b/commit-graph.c
> index 40c855f185..a571b523b7 100644
> --- a/commit-graph.c
> +++ b/commit-graph.c
> @@ -374,24 +375,41 @@ static int find_commit_in_graph(struct commit *item, struct commit_graph *g, uin
>  	}
>  }
>  
> -static int parse_commit_in_graph_one(struct commit_graph *g, struct commit *item)
> +static struct commit *parse_commit_in_graph_one(struct repository *r,
> +						struct commit_graph *g,
> +						struct commit *shell,
> +						const struct object_id *oid)

Now the complexity of the behaviour of this function deserves to be
documented in a comment in front.  Let me see if I can get it
correctly without such a comment by explaining the function aloud.

The caller may or may not have already obtained an in-core commit
object for a given object name, so shell could be NULL but otherwise
it could be used for optimization.  When shell==NULL, the function
looks up the commit object using the oid parameter instead.  The
returned in-core commit has the parents etc. filled as if we ran
parse_commit() on it.  If the commit is not yet in the graph, the
caller may get a NULL even if the commit exists.

>  {
>  	uint32_t pos;
>  
> -	if (item->object.parsed)
> -		return 1;
> +	if (shell && shell->object.parsed)
> +		return shell;
>  
> -	if (find_commit_in_graph(item, g, &pos))
> -		return fill_commit_in_graph(item, g, pos);
> +	if (shell && shell->graph_pos != COMMIT_NOT_FROM_GRAPH) {
> +		pos = shell->graph_pos;
> +	} else if (bsearch_graph(g, shell ? &shell->object.oid : oid, &pos)) {
> +		/* bsearch_graph sets pos */

Please spell an empty statement like so:

		; /* comment */

> +	} else {
> +		return NULL;

We come here when the commit (either came from shell or from oid) is
not found by bsearch_graph().  "Is the caller prepared for it, and
how?" is a natural question a reader would have.  Let's read on.

> +	}
>  
> -	return 0;
> +	if (!shell) {
> +		shell = lookup_commit(r, oid);
> +		if (!shell)
> +			return NULL;
> +	}
> +
> +	fill_commit_in_graph(shell, g, pos);
> +	return shell;
>  }
>  
> -int parse_commit_in_graph(struct repository *r, struct commit *item)
> +struct commit *parse_commit_in_graph(struct repository *r, struct commit *shell,
> +				     const struct object_id *oid)
>  {
>  	if (!prepare_commit_graph(r))
>  		return 0;
> -	return parse_commit_in_graph_one(r->objects->commit_graph, item);
> +	return parse_commit_in_graph_one(r, r->objects->commit_graph, shell,
> +					 oid);
>  }
>  
>  void load_commit_graph_info(struct repository *r, struct commit *item)
> @@ -1025,7 +1043,7 @@ int verify_commit_graph(struct repository *r, struct commit_graph *g)
>  		}
>  
>  		graph_commit = lookup_commit(r, &cur_oid);
> -		if (!parse_commit_in_graph_one(g, graph_commit))
> +		if (!parse_commit_in_graph_one(r, g, graph_commit, NULL))
>  			graph_report("failed to parse %s from commit-graph",
>  				     oid_to_hex(&cur_oid));
>  	}
> diff --git a/commit-graph.h b/commit-graph.h
> index 9db40b4d3a..8b7b5985dc 100644
> --- a/commit-graph.h
> +++ b/commit-graph.h
> @@ -13,16 +13,20 @@ struct commit;
>  char *get_commit_graph_filename(const char *obj_dir);
>  
>  /*
> - * Given a commit struct, try to fill the commit struct info, including:
> + * If the given commit (identified by shell->object.oid or oid) is in the
> + * commit graph, returns a commit struct (reusing shell if it is not NULL)
> + * including the following info:
>   *  1. tree object
>   *  2. date
>   *  3. parents.
>   *
> - * Returns 1 if and only if the commit was found in the packed graph.
> + * If not, returns NULL. See parse_commit_buffer() for the fallback after this
> + * call.
>   *
> - * See parse_commit_buffer() for the fallback after this call.
> + * Either shell or oid must be non-NULL. If both are non-NULL, oid is ignored.
>   */

OK, the eventual caller is the caller of this thing, which should
have been prepared to see NULL for a commit that is too new.  So
that should be OK.

> -int parse_commit_in_graph(struct repository *r, struct commit *item);
> +struct commit *parse_commit_in_graph(struct repository *r, struct commit *shell,
> +				     const struct object_id *oid);
>  
>  /*
>   * It is possible that we loaded commit contents from the commit buffer,
> diff --git a/commit.c b/commit.c
> index d13a7bc374..88eb580c5a 100644
> --- a/commit.c
> +++ b/commit.c
> @@ -456,7 +456,7 @@ int parse_commit_internal(struct commit *item, int quiet_on_missing, int use_com
>  		return -1;
>  	if (item->object.parsed)
>  		return 0;
> -	if (use_commit_graph && parse_commit_in_graph(the_repository, item))
> +	if (use_commit_graph && parse_commit_in_graph(the_repository, item, NULL))
>  		return 0;
>  	buffer = read_object_file(&item->object.oid, &type, &size);
>  	if (!buffer)
> diff --git a/revision.c b/revision.c
> index 13e0519c02..05fddb5880 100644
> --- a/revision.c
> +++ b/revision.c
> @@ -213,7 +213,10 @@ static struct object *get_reference(struct rev_info *revs, const char *name,
>  {
>  	struct object *object;
>  
> -	object = parse_object(revs->repo, oid);
> +	object = (struct object *) parse_commit_in_graph(revs->repo, NULL, oid);
> +	if (!object)
> +		object = parse_object(revs->repo, oid);

OK and this is such a caller.  I think a general rule of thumb is
that we need to access recent history a lot more often than the
older part of the history, and having to fall back for more recent
commits feels a bit disturbing, but I do not see an easy way to
reverse the performance characteristics offhand.



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux