Re: [RFC PATCH 3/6] commit-graph: enable replace-object and grafts

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Derrick Stolee <dstolee@xxxxxxxxxxxxx> writes:

First, this commit seems to try to do two things at once, and in its
current form it should be split into adding destroy_commit_graph() and
into grafts / replacements check.  Those are separate and unconnected
features, and should be in separate patches, in my opinion.

> Create destroy_commit_graph() method to delete the commit-graph file
> when history is altered by a replace-object call. If the commit-graph
> is rebuilt after that, we will load the correct object while reading
> the commit-graph.

I'm not sure if this is the best solution.  It would work both for
invalidation, and for turning off the feature, but in my opinion there
are better solutions for both:
 - in case of invalidation, wouldn't it be better to re-create the file?
 - in case of turning off, wouldn't it be better to simply set
   core_commit_graph to false?

Nevertheless I think destroy_commit_graph(), however it would be named
at the end, would be a good to have.

> When parsing a commit, first check if the commit was grafted. If so,
> then ignore the commit-graph for that commit and insted use the parents
> loaded by parsing the commit buffer and comparing against the graft
> file.

Unless uyou turn off using generation numbers and other such indices,
this is not enough to prevent from generating incorrect results if
current commit-graph file doesn't agree with current altered view of the
project history.

Take a simple example of joining two histories using replacements
(git-replace) or grafts.  Numbers denote generation numbers:

Original history:

    A<---B<---C             <===  master
    1    2    3


    a<---b<---c<---d<---e   <===  historical
    1    2    3    4    5

Altered view of history:

    a<---b<---c<---d<---e<---[A]<---B<---C   <===  master

In the new view of history, 'e' is reachable from 'C'.  But going by
pre-altering generation numbers, gen_{pre}(e) = 5  >  gen_{pre}(C) = 3.
This means that we don't even arrive at '[A]', where replacement object
or graft changed parents compared to what's loaded from commit-graph
file, isn't it?

>
> Signed-off-by: Derrick Stolee <dstolee@xxxxxxxxxxxxx>
> ---
>  builtin/replace.c |  3 +++
>  commit-graph.c    | 20 +++++++++++++++++++-
>  commit-graph.h    |  9 +++++++++
>  commit.c          |  5 +++++
>  4 files changed, 36 insertions(+), 1 deletion(-)
>
> diff --git a/builtin/replace.c b/builtin/replace.c
> index 9f01f3fc7f..d553aadcdc 100644
> --- a/builtin/replace.c
> +++ b/builtin/replace.c
> @@ -15,6 +15,7 @@
>  #include "parse-options.h"
>  #include "run-command.h"
>  #include "tag.h"
> +#include "commit-graph.h"
>  
>  static const char * const git_replace_usage[] = {
>  	N_("git replace [-f] <object> <replacement>"),
> @@ -468,6 +469,8 @@ int cmd_replace(int argc, const char **argv, const char *prefix)
>  		usage_msg_opt("--raw only makes sense with --edit",
>  			      git_replace_usage, options);
>  
> +	destroy_commit_graph(get_object_directory());
> +
>  	switch (cmdmode) {
>  	case MODE_DELETE:
>  		if (argc < 1)
> diff --git a/commit-graph.c b/commit-graph.c
> index e9195dfb17..95af4ed519 100644
> --- a/commit-graph.c
> +++ b/commit-graph.c
> @@ -6,6 +6,7 @@
>  #include "pack.h"
>  #include "packfile.h"
>  #include "commit.h"
> +#include "dir.h"
>  #include "object.h"
>  #include "refs.h"
>  #include "revision.h"
> @@ -240,15 +241,22 @@ static struct commit_list **insert_parent_or_die(struct commit_graph *g,
>  {
>  	struct commit *c;
>  	struct object_id oid;
> +	const unsigned char *real;
>  
>  	if (pos >= g->num_commits)
>  		die("invalid parent position %"PRIu64, pos);
>  
>  	hashcpy(oid.hash, g->chunk_oid_lookup + g->hash_len * pos);
> +
> +	real = lookup_replace_object(oid.hash);

So this function returns either original SHA-1 / OID of object, or if
object should be replaced then it returns the replacement object's name
(replaced recursively, if necessary).

This deals with replacements, but not with grafts (from graft file)!

> +
>  	c = lookup_commit(&oid);
>  	if (!c)
>  		die("could not find commit %s", oid_to_hex(&oid));
> -	c->graph_pos = pos;
> +
> +	if (!hashcmp(real, oid.hash))
> +		c->graph_pos = pos;

I don't quite understand this reasoning here.  The serialized
commit-graph file can either follow the current state of altered
history, or not.  In the later case the commit-graph can either follow
original history, or altered but not current view of history.

What does it matter of the object was replaced or not when marking
object as coming from graph at position 'pos'?  If commit is replaced,
it still comes from commit-graph at position 'pos', isn't it?

Also, what if the starting commit was replaced?  Then this logic
wouldn't file, is it?


I don't see how it relates to what's written in the commit message:

DS> When parsing a commit, first check if the commit was grafted. If so,
DS> then ignore the commit-graph for that commit and insted use the parents
DS> loaded by parsing the commit buffer and comparing against the graft
DS> file.

> +
>  	return &commit_list_insert(c, pptr)->next;
>  }
>  
> @@ -1019,3 +1027,13 @@ int verify_commit_graph(struct commit_graph *g)
>  
>  	return verify_commit_graph_error;
>  }
> +
> +void destroy_commit_graph(const char *obj_dir)
> +{
> +	char *graph_name;
> +	close_commit_graph();
> +
> +	graph_name = get_commit_graph_filename(obj_dir);
> +	remove_path(graph_name);
> +	FREE_AND_NULL(graph_name);
> +}

All right, though I onder if it wouldn't be better to first
delete/remove/unlink the graph file, and only then close it.

Otherwise I think you might be introducing subtle race condition here.
But I may be wrong here.

> diff --git a/commit-graph.h b/commit-graph.h
> index 9a06a5f188..1d17da1582 100644
> --- a/commit-graph.h
> +++ b/commit-graph.h
> @@ -56,4 +56,13 @@ void write_commit_graph(const char *obj_dir,
>  
>  int verify_commit_graph(struct commit_graph *g);
>  
> +/*
> + * Delete the commit-graph file in the given object directory.
> + *
> + * WARNING: this deletes data, so should only be used when
> + * performing history-altering actions like replace-object
> + * or grafts.
> + */
> +void destroy_commit_graph(const char *obj_dir);

Why it is not called delete_commit_graph(), then?

> +
>  #endif
> diff --git a/commit.c b/commit.c
> index 6eaed0174c..2fe31cde77 100644
> --- a/commit.c
> +++ b/commit.c
> @@ -403,6 +403,11 @@ int parse_commit_internal(struct commit *item, int quiet_on_missing, int use_com
>  		return -1;
>  	if (item->object.parsed)
>  		return 0;
> +
> +	prepare_commit_graft();
> +	if (commit_graft_pos(item->object.oid.hash) >= 0)
> +		use_commit_graph = 0;

All right, this allegedly handles grafts.

I don't understand why you treat replacement objects in different place
than grafts; I can understand that you might want to treat them
differently (because grafts are deprecated).

> +
>  	if (use_commit_graph && parse_commit_in_graph(item))
>  		return 0;
>  	buffer = read_sha1_file(item->object.oid.hash, &type, &size);



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux