On Tue, Jul 28, 2020 at 09:14:42AM -0400, Derrick Stolee wrote: > On 7/28/2020 5:13 AM, Abhishek Kumar via GitGitGadget wrote: > > From: Abhishek Kumar <abhishekkumar8222@xxxxxxxxx> > > > > Both fill_commit_graph_info() and fill_commit_in_graph() parse > > information present in commit data chunk. Let's simplify the > > implementation by calling fill_commit_graph_info() within > > fill_commit_in_graph(). > > > > The test 'generate tar with future mtime' creates a commit with commit > > time of (2 ^ 36 + 1) seconds since EPOCH. The commit time overflows into > > generation number and has undefined behavior. The test used to pass as > > fill_commit_in_graph() did not read commit time from commit graph, > > reading commit date from odb instead. > > I was first confused as to why fill_commit_graph_info() did not > load the timestamp, but the reason is that it is only used by > two methods: > > 1. fill_commit_in_graph(): this actually leaves the commit in a > "parsed" state, so the date must be correct. Thus, it parses > the date out of the commit-graph. > > 2. load_commit_graph_info(): this only helps to guarantee we > know the graph_pos and generation number values. > > Perhaps add this extra context: you will _need_ the commit date > from the commit-graph in order to populate the generation number > v2 in fill_commit_graph_info(). > > > Let's fix that by setting commit time of (2 ^ 34 - 1) seconds. > > The timestamp limit placed in the commit-graph is more restrictive > than 64-bit timestamps, but as your test points out, the maximum > timestamp allowed takes place in the year 2514. That is far enough > away for all real data. > > > Signed-off-by: Abhishek Kumar <abhishekkumar8222@xxxxxxxxx> > > --- > > commit-graph.c | 31 ++++++++++++------------------- > > t/t5000-tar-tree.sh | 4 ++-- > > 2 files changed, 14 insertions(+), 21 deletions(-) > > > > diff --git a/commit-graph.c b/commit-graph.c > > index 5d3c9bd23c..204eb454b2 100644 > > --- a/commit-graph.c > > +++ b/commit-graph.c > > @@ -735,15 +735,24 @@ static void fill_commit_graph_info(struct commit *item, struct commit_graph *g, > > const unsigned char *commit_data; > > struct commit_graph_data *graph_data; > > uint32_t lex_index; > > + uint64_t date_high, date_low; > > > > while (pos < g->num_commits_in_base) > > g = g->base_graph; > > > > + if (pos >= g->num_commits + g->num_commits_in_base) > > + die(_("invalid commit position. commit-graph is likely corrupt")); > > + > > lex_index = pos - g->num_commits_in_base; > > commit_data = g->chunk_commit_data + GRAPH_DATA_WIDTH * lex_index; > > > > graph_data = commit_graph_data_at(item); > > graph_data->graph_pos = pos; > > + > > + date_high = get_be32(commit_data + g->hash_len + 8) & 0x3; > > + date_low = get_be32(commit_data + g->hash_len + 12); > > + item->date = (timestamp_t)((date_high << 32) | date_low); > > + > > graph_data->generation = get_be32(commit_data + g->hash_len + 8) >> 2; > > } > > > > @@ -758,38 +767,22 @@ static int fill_commit_in_graph(struct repository *r, > > { > > uint32_t edge_value; > > uint32_t *parent_data_ptr; > > - uint64_t date_low, date_high; > > struct commit_list **pptr; > > - struct commit_graph_data *graph_data; > > const unsigned char *commit_data; > > uint32_t lex_index; > > > > + fill_commit_graph_info(item, g, pos); > > + > > while (pos < g->num_commits_in_base) > > g = g->base_graph; > > This 'while' loop happens in both implementations, so you could > save a miniscule amount of time by placing the call to > fill_commit_graph_info() after the while loop. > > > - if (pos >= g->num_commits + g->num_commits_in_base) > > - die(_("invalid commit position. commit-graph is likely corrupt")); > > > - /* > > - * Store the "full" position, but then use the > > - * "local" position for the rest of the calculation. > > - */ > > - graph_data = commit_graph_data_at(item); > > - graph_data->graph_pos = pos; > > lex_index = pos - g->num_commits_in_base; > > - > > - commit_data = g->chunk_commit_data + (g->hash_len + 16) * lex_index; > > + commit_data = g->chunk_commit_data + GRAPH_DATA_WIDTH * lex_index; > > I was about to complain about this change, but GRAPH_DATA_WIDTH > is a macro that does an equivalent thing (except the_hash_algo->rawsz > instead of g->hash_len). > > > > > item->object.parsed = 1; > > > > set_commit_tree(item, NULL); > > > > - date_high = get_be32(commit_data + g->hash_len + 8) & 0x3; > > - date_low = get_be32(commit_data + g->hash_len + 12); > > - item->date = (timestamp_t)((date_high << 32) | date_low); > > - > > - graph_data->generation = get_be32(commit_data + g->hash_len + 8) >> 2; > > - > > pptr = &item->parents; > > > > edge_value = get_be32(commit_data + g->hash_len); > > diff --git a/t/t5000-tar-tree.sh b/t/t5000-tar-tree.sh > > index 37655a237c..1986354fc3 100755 > > --- a/t/t5000-tar-tree.sh > > +++ b/t/t5000-tar-tree.sh > > @@ -406,7 +406,7 @@ test_expect_success TIME_IS_64BIT 'set up repository with far-future commit' ' > > rm -f .git/index && > > echo content >file && > > git add file && > > - GIT_COMMITTER_DATE="@68719476737 +0000" \ > > + GIT_COMMITTER_DATE="@17179869183 +0000" \ > > git commit -m "tempori parendum" > > ' > > > > @@ -415,7 +415,7 @@ test_expect_success TIME_IS_64BIT 'generate tar with future mtime' ' > > ' > > > > test_expect_success TAR_HUGE,TIME_IS_64BIT,TIME_T_IS_64BIT 'system tar can read our future mtime' ' > > - echo 4147 >expect && > > + echo 2514 >expect && > > tar_info future.tar | cut -d" " -f2 >actual && > > test_cmp expect actual > > ' > > > > Thanks, > -Stolee Agreed with Stolee's review, but otherwise this looks like a faithful transformation. Thanks, Taylor