Re: [PATCH v4 04/10] commit-graph: return 64-bit generation number

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Abhishek,

"Abhishek Kumar via GitGitGadget" <gitgitgadget@xxxxxxxxx> writes:

> From: Abhishek Kumar <abhishekkumar8222@xxxxxxxxx>
>
> In a preparatory step, let's return timestamp_t values from
> commit_graph_generation(), use timestamp_t for local variables and
> define GENERATION_NUMBER_INFINITY as (2 ^ 63 - 1) instead.

I think it would be easier to understand if it was explicitely said what
this preparatory step prepares for, e.g.:

  In a preparatory step for introducing corrected commit dates as
  generation number, let's return timestamp_t values from...

Or even

  generation number, let's change the return type of
  commit_graph_generation() to timestamp_t, and use ...

Otherwise it looks good.

>
> We rename GENERATION_NUMBER_MAX to GENERATION_NUMBER_V1_MAX to
> represent the largest topological level we can store in the commit data
> chunk.
>
> With corrected commit dates implemented, we will have two such *_MAX
> variables to denote the largest offset and largest topological level
> that can be stored.

All right, nice explanation.

>
> Signed-off-by: Abhishek Kumar <abhishekkumar8222@xxxxxxxxx>

Note that there are two changes that are not mentioned in the commit
message, namely adding 'const'-ness to generation_a/b local variables in
commit_gen_cmp() from commit-graph.c, and switching from
GENERATION_NUMBER_ZERO to GENERATION_NUMBER_INFINITY as the default
(initial) value for 'max_generation' in repo_in_merge_bases_many().

While the former is a simple "while-at-it" change that shouldn't affect
correctness, the latter needs an explanation (or fixing if it is wrong).

> ---
>  commit-graph.c | 22 +++++++++++-----------
>  commit-graph.h |  4 ++--
>  commit-reach.c | 36 ++++++++++++++++++------------------
>  commit-reach.h |  2 +-
>  commit.c       |  4 ++--
>  commit.h       |  4 ++--
>  revision.c     | 10 +++++-----
>  upload-pack.c  |  2 +-
>  8 files changed, 42 insertions(+), 42 deletions(-)
>
> diff --git a/commit-graph.c b/commit-graph.c
> index e8362e144e..bfc532de6f 100644
> --- a/commit-graph.c
> +++ b/commit-graph.c
> @@ -99,7 +99,7 @@ uint32_t commit_graph_position(const struct commit *c)
>  	return data ? data->graph_pos : COMMIT_NOT_FROM_GRAPH;
>  }
>  
> -uint32_t commit_graph_generation(const struct commit *c)
> +timestamp_t commit_graph_generation(const struct commit *c)

All right.

>  {
>  	struct commit_graph_data *data =
>  		commit_graph_data_slab_peek(&commit_graph_data_slab, c);
> @@ -144,8 +144,8 @@ static int commit_gen_cmp(const void *va, const void *vb)
>  	const struct commit *a = *(const struct commit **)va;
>  	const struct commit *b = *(const struct commit **)vb;
>  
> -	uint32_t generation_a = commit_graph_data_at(a)->generation;
> -	uint32_t generation_b = commit_graph_data_at(b)->generation;
> +	const timestamp_t generation_a = commit_graph_data_at(a)->generation;
> +	const timestamp_t generation_b = commit_graph_data_at(b)->generation;

All right... but this also adds 'const' qualifier.  I understand that
you don't want to create separate commit for this "while at it"
change...

>  	/* lower generation commits first */
>  	if (generation_a < generation_b)
>  		return -1;
> @@ -1350,7 +1350,7 @@ static void compute_generation_numbers(struct write_commit_graph_context *ctx)
>  					_("Computing commit graph generation numbers"),
>  					ctx->commits.nr);
>  	for (i = 0; i < ctx->commits.nr; i++) {
> -		uint32_t generation = commit_graph_data_at(ctx->commits.list[i])->generation;
> +		timestamp_t generation = commit_graph_data_at(ctx->commits.list[i])->generation;

All right.

>  
>  		display_progress(ctx->progress, i + 1);
>  		if (generation != GENERATION_NUMBER_INFINITY &&
> @@ -1383,8 +1383,8 @@ static void compute_generation_numbers(struct write_commit_graph_context *ctx)
>  				data->generation = max_generation + 1;
>  				pop_commit(&list);
>  
> -				if (data->generation > GENERATION_NUMBER_MAX)
> -					data->generation = GENERATION_NUMBER_MAX;
> +				if (data->generation > GENERATION_NUMBER_V1_MAX)
> +					data->generation = GENERATION_NUMBER_V1_MAX;

All right, this is the other mentioned change.

>  			}
>  		}
>  	}
> @@ -2404,8 +2404,8 @@ int verify_commit_graph(struct repository *r, struct commit_graph *g, int flags)
>  	for (i = 0; i < g->num_commits; i++) {
>  		struct commit *graph_commit, *odb_commit;
>  		struct commit_list *graph_parents, *odb_parents;
> -		uint32_t max_generation = 0;
> -		uint32_t generation;
> +		timestamp_t max_generation = 0;
> +		timestamp_t generation;

All right.

>  
>  		display_progress(progress, i + 1);
>  		hashcpy(cur_oid.hash, g->chunk_oid_lookup + g->hash_len * i);
> @@ -2469,11 +2469,11 @@ int verify_commit_graph(struct repository *r, struct commit_graph *g, int flags)
>  			continue;
>  
>  		/*
> -		 * If one of our parents has generation GENERATION_NUMBER_MAX, then
> -		 * our generation is also GENERATION_NUMBER_MAX. Decrement to avoid
> +		 * If one of our parents has generation GENERATION_NUMBER_V1_MAX, then
> +		 * our generation is also GENERATION_NUMBER_V1_MAX. Decrement to avoid
>  		 * extra logic in the following condition.
>  		 */
> -		if (max_generation == GENERATION_NUMBER_MAX)
> +		if (max_generation == GENERATION_NUMBER_V1_MAX)
>  			max_generation--;

All right.  Nice fixing a comment too.

>  
>  		generation = commit_graph_generation(graph_commit);
> diff --git a/commit-graph.h b/commit-graph.h
> index f8e92500c6..8be247fa35 100644
> --- a/commit-graph.h
> +++ b/commit-graph.h
> @@ -144,12 +144,12 @@ void disable_commit_graph(struct repository *r);
>  
>  struct commit_graph_data {
>  	uint32_t graph_pos;
> -	uint32_t generation;
> +	timestamp_t generation;
>  };

All right.

>  
>  /*
>   * Commits should be parsed before accessing generation, graph positions.
>   */
> -uint32_t commit_graph_generation(const struct commit *);
> +timestamp_t commit_graph_generation(const struct commit *);
>  uint32_t commit_graph_position(const struct commit *);
>  #endif

All right.

> diff --git a/commit-reach.c b/commit-reach.c
> index 50175b159e..20b48b872b 100644
> --- a/commit-reach.c
> +++ b/commit-reach.c
> @@ -32,12 +32,12 @@ static int queue_has_nonstale(struct prio_queue *queue)
>  static struct commit_list *paint_down_to_common(struct repository *r,
>  						struct commit *one, int n,
>  						struct commit **twos,
> -						int min_generation)
> +						timestamp_t min_generation)
>  {
>  	struct prio_queue queue = { compare_commits_by_gen_then_commit_date };
>  	struct commit_list *result = NULL;
>  	int i;
> -	uint32_t last_gen = GENERATION_NUMBER_INFINITY;
> +	timestamp_t last_gen = GENERATION_NUMBER_INFINITY;

All right.

>  
>  	if (!min_generation)
>  		queue.compare = compare_commits_by_commit_date;
> @@ -58,10 +58,10 @@ static struct commit_list *paint_down_to_common(struct repository *r,
>  		struct commit *commit = prio_queue_get(&queue);
>  		struct commit_list *parents;
>  		int flags;
> -		uint32_t generation = commit_graph_generation(commit);
> +		timestamp_t generation = commit_graph_generation(commit);

All right.

>  
>  		if (min_generation && generation > last_gen)
> -			BUG("bad generation skip %8x > %8x at %s",
> +			BUG("bad generation skip %"PRItime" > %"PRItime" at %s",

All right; nice of you noticing this issue.

>  			    generation, last_gen,
>  			    oid_to_hex(&commit->object.oid));
>  		last_gen = generation;
> @@ -177,12 +177,12 @@ static int remove_redundant(struct repository *r, struct commit **array, int cnt
>  		repo_parse_commit(r, array[i]);
>  	for (i = 0; i < cnt; i++) {
>  		struct commit_list *common;
> -		uint32_t min_generation = commit_graph_generation(array[i]);
> +		timestamp_t min_generation = commit_graph_generation(array[i]);
>  
>  		if (redundant[i])
>  			continue;
>  		for (j = filled = 0; j < cnt; j++) {
> -			uint32_t curr_generation;
> +			timestamp_t curr_generation;
>  			if (i == j || redundant[j])
>  				continue;
>  			filled_index[filled] = j;

All right.

> @@ -321,7 +321,7 @@ int repo_in_merge_bases_many(struct repository *r, struct commit *commit,
>  {
>  	struct commit_list *bases;
>  	int ret = 0, i;
> -	uint32_t generation, max_generation = GENERATION_NUMBER_ZERO;
> +	timestamp_t generation, max_generation = GENERATION_NUMBER_INFINITY;

The change of type from uint32_t to timestamp_t is expected, but the
change from GENERATION_NUMBER_ZERO to GENERATION_NUMBER_INFINITY is not.

This might be caused by the fact that repo_in_merge_bases_many()
switched from using min_generation and GENERATION_NUMBER_INFINITY to
using max_generation and GENERATION_NUMBER_ZERO. Or the reverse: I see
one version on https://github.com/git/git, and other version in 'master'
pulled from https://github.com/git-for-windows/git

Certainly max_generation should be paired with GENERATION_NUMBER_ZERO,
and min_generation with GENERATION_NUMBER_INFINITY.

>  
>  	if (repo_parse_commit(r, commit))
>  		return ret;
> @@ -470,7 +470,7 @@ static int in_commit_list(const struct commit_list *want, struct commit *c)
>  static enum contains_result contains_test(struct commit *candidate,
>  					  const struct commit_list *want,
>  					  struct contains_cache *cache,
> -					  uint32_t cutoff)
> +					  timestamp_t cutoff)

All right.

Sidenote: this parameter should probably be named gen_cutoff, for
consistency and better readability (but that was the existing state),
but this would also mean more changes.


>  {
>  	enum contains_result *cached = contains_cache_at(cache, candidate);
>  
> @@ -506,11 +506,11 @@ static enum contains_result contains_tag_algo(struct commit *candidate,
>  {
>  	struct contains_stack contains_stack = { 0, 0, NULL };
>  	enum contains_result result;
> -	uint32_t cutoff = GENERATION_NUMBER_INFINITY;
> +	timestamp_t cutoff = GENERATION_NUMBER_INFINITY;

Sidenote: this variable should probably be named gen_cutoff, for
consistency and better readability (but that was the existing state).
However changing it would pollute this commit with unrelated changes;
it is not that big of an isseu that it *requires* fixing.

>  	const struct commit_list *p;
>  
>  	for (p = want; p; p = p->next) {
> -		uint32_t generation;
> +		timestamp_t generation;
>  		struct commit *c = p->item;
>  		load_commit_graph_info(the_repository, c);
>  		generation = commit_graph_generation(c);

All right.

> @@ -566,8 +566,8 @@ static int compare_commits_by_gen(const void *_a, const void *_b)
>  	const struct commit *a = *(const struct commit * const *)_a;
>  	const struct commit *b = *(const struct commit * const *)_b;
>  
> -	uint32_t generation_a = commit_graph_generation(a);
> -	uint32_t generation_b = commit_graph_generation(b);
> +	timestamp_t generation_a = commit_graph_generation(a);
> +	timestamp_t generation_b = commit_graph_generation(b);

All right.

>  
>  	if (generation_a < generation_b)
>  		return -1;
> @@ -580,7 +580,7 @@ int can_all_from_reach_with_flag(struct object_array *from,
>  				 unsigned int with_flag,
>  				 unsigned int assign_flag,
>  				 time_t min_commit_date,
> -				 uint32_t min_generation)
> +				 timestamp_t min_generation)
>  {
>  	struct commit **list = NULL;
>  	int i;

All right.

> @@ -681,13 +681,13 @@ int can_all_from_reach(struct commit_list *from, struct commit_list *to,
>  	time_t min_commit_date = cutoff_by_min_date ? from->item->date : 0;
>  	struct commit_list *from_iter = from, *to_iter = to;
>  	int result;
> -	uint32_t min_generation = GENERATION_NUMBER_INFINITY;
> +	timestamp_t min_generation = GENERATION_NUMBER_INFINITY;
>  
>  	while (from_iter) {
>  		add_object_array(&from_iter->item->object, NULL, &from_objs);
>  
>  		if (!parse_commit(from_iter->item)) {
> -			uint32_t generation;
> +			timestamp_t generation;
>  			if (from_iter->item->date < min_commit_date)
>  				min_commit_date = from_iter->item->date;
>

All right.

> @@ -701,7 +701,7 @@ int can_all_from_reach(struct commit_list *from, struct commit_list *to,
>  
>  	while (to_iter) {
>  		if (!parse_commit(to_iter->item)) {
> -			uint32_t generation;
> +			timestamp_t generation;
>  			if (to_iter->item->date < min_commit_date)
>  				min_commit_date = to_iter->item->date;
>

All right.

> @@ -741,13 +741,13 @@ struct commit_list *get_reachable_subset(struct commit **from, int nr_from,
>  	struct commit_list *found_commits = NULL;
>  	struct commit **to_last = to + nr_to;
>  	struct commit **from_last = from + nr_from;
> -	uint32_t min_generation = GENERATION_NUMBER_INFINITY;
> +	timestamp_t min_generation = GENERATION_NUMBER_INFINITY;
>  	int num_to_find = 0;
>  
>  	struct prio_queue queue = { compare_commits_by_gen_then_commit_date };
>  
>  	for (item = to; item < to_last; item++) {
> -		uint32_t generation;
> +		timestamp_t generation;
>  		struct commit *c = *item;
>  
>  		parse_commit(c);

All right.

> diff --git a/commit-reach.h b/commit-reach.h
> index b49ad71a31..148b56fea5 100644
> --- a/commit-reach.h
> +++ b/commit-reach.h
> @@ -87,7 +87,7 @@ int can_all_from_reach_with_flag(struct object_array *from,
>  				 unsigned int with_flag,
>  				 unsigned int assign_flag,
>  				 time_t min_commit_date,
> -				 uint32_t min_generation);
> +				 timestamp_t min_generation);
>  int can_all_from_reach(struct commit_list *from, struct commit_list *to,
>  		       int commit_date_cutoff);
>

All right.

> diff --git a/commit.c b/commit.c
> index f53429c0ac..3b488381d5 100644
> --- a/commit.c
> +++ b/commit.c
> @@ -731,8 +731,8 @@ int compare_commits_by_author_date(const void *a_, const void *b_,
>  int compare_commits_by_gen_then_commit_date(const void *a_, const void *b_, void *unused)
>  {
>  	const struct commit *a = a_, *b = b_;
> -	const uint32_t generation_a = commit_graph_generation(a),
> -		       generation_b = commit_graph_generation(b);
> +	const timestamp_t generation_a = commit_graph_generation(a),
> +			  generation_b = commit_graph_generation(b);
>

All right (assuming that the indent after change looks all right; but
even if it doesn't t would be a very minor issue).

>  	/* newer commits first */
>  	if (generation_a < generation_b)
> diff --git a/commit.h b/commit.h
> index 5467786c7b..33c66b2177 100644
> --- a/commit.h
> +++ b/commit.h
> @@ -11,8 +11,8 @@
>  #include "commit-slab.h"
>  
>  #define COMMIT_NOT_FROM_GRAPH 0xFFFFFFFF
> -#define GENERATION_NUMBER_INFINITY 0xFFFFFFFF
> -#define GENERATION_NUMBER_MAX 0x3FFFFFFF
> +#define GENERATION_NUMBER_INFINITY ((1ULL << 63) - 1)
> +#define GENERATION_NUMBER_V1_MAX 0x3FFFFFFF
>  #define GENERATION_NUMBER_ZERO 0
>

All right, we redefine GENERATION_NUMBER_INFINITY and rename
GENERATION_NUMBER_MAX.

>  struct commit_list {
> diff --git a/revision.c b/revision.c
> index c97abcdde1..2861f1c45c 100644
> --- a/revision.c
> +++ b/revision.c
> @@ -3308,7 +3308,7 @@ define_commit_slab(indegree_slab, int);
>  define_commit_slab(author_date_slab, timestamp_t);
>  
>  struct topo_walk_info {
> -	uint32_t min_generation;
> +	timestamp_t min_generation;
>  	struct prio_queue explore_queue;
>  	struct prio_queue indegree_queue;
>  	struct prio_queue topo_queue;

All right.

> @@ -3354,7 +3354,7 @@ static void explore_walk_step(struct rev_info *revs)
>  }
>  
>  static void explore_to_depth(struct rev_info *revs,
> -			     uint32_t gen_cutoff)
> +			     timestamp_t gen_cutoff)
>  {
>  	struct topo_walk_info *info = revs->topo_walk_info;
>  	struct commit *c;

All right.

> @@ -3397,7 +3397,7 @@ static void indegree_walk_step(struct rev_info *revs)
>  }
>  
>  static void compute_indegrees_to_depth(struct rev_info *revs,
> -				       uint32_t gen_cutoff)
> +				       timestamp_t gen_cutoff)
>  {
>  	struct topo_walk_info *info = revs->topo_walk_info;
>  	struct commit *c;

All right.

> @@ -3455,7 +3455,7 @@ static void init_topo_walk(struct rev_info *revs)
>  	info->min_generation = GENERATION_NUMBER_INFINITY;
>  	for (list = revs->commits; list; list = list->next) {
>  		struct commit *c = list->item;
> -		uint32_t generation;
> +		timestamp_t generation;
>  
>  		if (repo_parse_commit_gently(revs->repo, c, 1))
>  			continue;

All right.

> @@ -3516,7 +3516,7 @@ static void expand_topo_walk(struct rev_info *revs, struct commit *commit)
>  	for (p = commit->parents; p; p = p->next) {
>  		struct commit *parent = p->item;
>  		int *pi;
> -		uint32_t generation;
> +		timestamp_t generation;
>  
>  		if (parent->object.flags & UNINTERESTING)
>  			continue;

All right.

> diff --git a/upload-pack.c b/upload-pack.c
> index 3b858eb457..fdb82885b6 100644
> --- a/upload-pack.c
> +++ b/upload-pack.c
> @@ -497,7 +497,7 @@ static int got_oid(struct upload_pack_data *data,
>  
>  static int ok_to_give_up(struct upload_pack_data *data)
>  {
> -	uint32_t min_generation = GENERATION_NUMBER_ZERO;
> +	timestamp_t min_generation = GENERATION_NUMBER_ZERO;
>  
>  	if (!data->have_obj.nr)
>  		return 0;

All right.

The only thing to check is if you have changed the type in all the
places that need it. My cursory examination shows that those are all
places than need fixing.

Note that the 'generation' variable in git-name-rev, git-fsck and in
git-show-branch (snd sha1-name.c) means something different.

Also, 'first_generation' variable in generation_numbers_enabled() (part
of commit-graph.c) examines and will examine generation number v1 i.e.
topological levels, and do not need type change... though it may require
name change in some time in the future; the generation number
computation path also does not require change type, though variables
would be renamed in the future commit.

Best,
-- 
Jakub Narębski




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux