Re: [PATCH 3/6] commit-graph: consolidate fill_commit_graph_info

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Jul 28, 2020 at 09:14:42AM -0400, Derrick Stolee wrote:
> On 7/28/2020 5:13 AM, Abhishek Kumar via GitGitGadget wrote:
> > From: Abhishek Kumar <abhishekkumar8222@xxxxxxxxx>
> >
> > Both fill_commit_graph_info() and fill_commit_in_graph() parse
> > information present in commit data chunk. Let's simplify the
> > implementation by calling fill_commit_graph_info() within
> > fill_commit_in_graph().
> >
> > The test 'generate tar with future mtime' creates a commit with commit
> > time of (2 ^ 36 + 1) seconds since EPOCH. The commit time overflows into
> > generation number and has undefined behavior. The test used to pass as
> > fill_commit_in_graph() did not read commit time from commit graph,
> > reading commit date from odb instead.
>
> I was first confused as to why fill_commit_graph_info() did not
> load the timestamp, but the reason is that it is only used by
> two methods:
>
> 1. fill_commit_in_graph(): this actually leaves the commit in a
>    "parsed" state, so the date must be correct. Thus, it parses
>    the date out of the commit-graph.
>
> 2. load_commit_graph_info(): this only helps to guarantee we
>    know the graph_pos and generation number values.
>
> Perhaps add this extra context: you will _need_ the commit date
> from the commit-graph in order to populate the generation number
> v2 in fill_commit_graph_info().
>
> > Let's fix that by setting commit time of (2 ^ 34 - 1) seconds.
>
> The timestamp limit placed in the commit-graph is more restrictive
> than 64-bit timestamps, but as your test points out, the maximum
> timestamp allowed takes place in the year 2514. That is far enough
> away for all real data.
>
> > Signed-off-by: Abhishek Kumar <abhishekkumar8222@xxxxxxxxx>
> > ---
> >  commit-graph.c      | 31 ++++++++++++-------------------
> >  t/t5000-tar-tree.sh |  4 ++--
> >  2 files changed, 14 insertions(+), 21 deletions(-)
> >
> > diff --git a/commit-graph.c b/commit-graph.c
> > index 5d3c9bd23c..204eb454b2 100644
> > --- a/commit-graph.c
> > +++ b/commit-graph.c
> > @@ -735,15 +735,24 @@ static void fill_commit_graph_info(struct commit *item, struct commit_graph *g,
> >  	const unsigned char *commit_data;
> >  	struct commit_graph_data *graph_data;
> >  	uint32_t lex_index;
> > +	uint64_t date_high, date_low;
> >
> >  	while (pos < g->num_commits_in_base)
> >  		g = g->base_graph;
> >
> > +	if (pos >= g->num_commits + g->num_commits_in_base)
> > +		die(_("invalid commit position. commit-graph is likely corrupt"));
> > +
> >  	lex_index = pos - g->num_commits_in_base;
> >  	commit_data = g->chunk_commit_data + GRAPH_DATA_WIDTH * lex_index;
> >
> >  	graph_data = commit_graph_data_at(item);
> >  	graph_data->graph_pos = pos;
> > +
> > +	date_high = get_be32(commit_data + g->hash_len + 8) & 0x3;
> > +	date_low = get_be32(commit_data + g->hash_len + 12);
> > +	item->date = (timestamp_t)((date_high << 32) | date_low);
> > +
> >  	graph_data->generation = get_be32(commit_data + g->hash_len + 8) >> 2;
> >  }
> >
> > @@ -758,38 +767,22 @@ static int fill_commit_in_graph(struct repository *r,
> >  {
> >  	uint32_t edge_value;
> >  	uint32_t *parent_data_ptr;
> > -	uint64_t date_low, date_high;
> >  	struct commit_list **pptr;
> > -	struct commit_graph_data *graph_data;
> >  	const unsigned char *commit_data;
> >  	uint32_t lex_index;
> >
> > +	fill_commit_graph_info(item, g, pos);
> > +
> >  	while (pos < g->num_commits_in_base)
> >  		g = g->base_graph;
>
> This 'while' loop happens in both implementations, so you could
> save a miniscule amount of time by placing the call to
> fill_commit_graph_info() after the while loop.
>
> > -	if (pos >= g->num_commits + g->num_commits_in_base)
> > -		die(_("invalid commit position. commit-graph is likely corrupt"));
>
> > -	/*
> > -	 * Store the "full" position, but then use the
> > -	 * "local" position for the rest of the calculation.
> > -	 */
> > -	graph_data = commit_graph_data_at(item);
> > -	graph_data->graph_pos = pos;
> >  	lex_index = pos - g->num_commits_in_base;
> > -
> > -	commit_data = g->chunk_commit_data + (g->hash_len + 16) * lex_index;
> > +	commit_data = g->chunk_commit_data + GRAPH_DATA_WIDTH * lex_index;
>
> I was about to complain about this change, but GRAPH_DATA_WIDTH
> is a macro that does an equivalent thing (except the_hash_algo->rawsz
> instead of g->hash_len).
>
> >
> >  	item->object.parsed = 1;
> >
> >  	set_commit_tree(item, NULL);
> >
> > -	date_high = get_be32(commit_data + g->hash_len + 8) & 0x3;
> > -	date_low = get_be32(commit_data + g->hash_len + 12);
> > -	item->date = (timestamp_t)((date_high << 32) | date_low);
> > -
> > -	graph_data->generation = get_be32(commit_data + g->hash_len + 8) >> 2;
> > -
> >  	pptr = &item->parents;
> >
> >  	edge_value = get_be32(commit_data + g->hash_len);
> > diff --git a/t/t5000-tar-tree.sh b/t/t5000-tar-tree.sh
> > index 37655a237c..1986354fc3 100755
> > --- a/t/t5000-tar-tree.sh
> > +++ b/t/t5000-tar-tree.sh
> > @@ -406,7 +406,7 @@ test_expect_success TIME_IS_64BIT 'set up repository with far-future commit' '
> >  	rm -f .git/index &&
> >  	echo content >file &&
> >  	git add file &&
> > -	GIT_COMMITTER_DATE="@68719476737 +0000" \
> > +	GIT_COMMITTER_DATE="@17179869183 +0000" \
> >  		git commit -m "tempori parendum"
> >  '
> >
> > @@ -415,7 +415,7 @@ test_expect_success TIME_IS_64BIT 'generate tar with future mtime' '
> >  '
> >
> >  test_expect_success TAR_HUGE,TIME_IS_64BIT,TIME_T_IS_64BIT 'system tar can read our future mtime' '
> > -	echo 4147 >expect &&
> > +	echo 2514 >expect &&
> >  	tar_info future.tar | cut -d" " -f2 >actual &&
> >  	test_cmp expect actual
> >  '
> >
>
> Thanks,
> -Stolee

Agreed with Stolee's review, but otherwise this looks like a faithful
transformation.

Thanks,
Taylor



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux