Hi Abhishek, In short: everything is all right, except for the now duplicated test names in t5000 after this commit. "Abhishek Kumar via GitGitGadget" <gitgitgadget@xxxxxxxxx> writes: > From: Abhishek Kumar <abhishekkumar8222@xxxxxxxxx> > > Both fill_commit_graph_info() and fill_commit_in_graph() parse > information present in commit data chunk. Let's simplify the > implementation by calling fill_commit_graph_info() within > fill_commit_in_graph(). > > fill_commit_graph_info() used to not load committer data from commit data > chunk. However, with the corrected committer date, we have to load > committer date to calculate generation number value. Nice writeup, however the last sentence would in my opinion read better in the future tense: we don't use generation number v2 yet. For example: However, with upcoming switch to using corrected committer date as generation number v2, we will have to load committer date to compute generation number value anyway. Or something like that - notice the minor addition and changes. The following is slightly unrelated change, but we agreed that it would be better to not separate them; the need for change to the t5000 test is caused by the change described above. > > e51217e15 (t5000: test tar files that overflow ustar headers, > 30-06-2016) introduced a test 'generate tar with future mtime' that > creates a commit with committer date of (2 ^ 36 + 1) seconds since > EPOCH. The CDAT chunk provides 34-bits for storing committer date, thus > committer time overflows into generation number (within CDAT chunk) and > has undefined behavior. > > The test used to pass as fill_commit_graph_info() would not set struct > member `date` of struct commit and loads committer date from the object > database, generating a tar file with the expected mtime. I think it should be s/loads/load/, as in "would load", but I am not a native English speaker. > > However, with corrected commit date, we will load the committer date > from CDAT chunk (truncated to lower 34-bits to populate the generation > number. Thus, Git sets date and generates tar file with the truncated > mtime. > > The ustar format (the header format used by most modern tar programs) > only has room for 11 (or 12, depending om some implementations) octal > digits for the size and mtime of each files. > > Thus, setting a timestamp of 2 ^ 33 + 1 would overflow the 11-octal > digit implementations while still fitting into commit data chunk. > > Since we want to test 12-octal digit implementations of ustar as well, > let's modify the existing test to no longer use commit-graph file. The description above is for me does not make it entirely clear that we add new test for handling possible 11-octal digit overflow nearly identical to the existing one, and turn off use of commit-graph file for test that checks handling 12-octal digit overflow. > Signed-off-by: Abhishek Kumar <abhishekkumar8222@xxxxxxxxx> > --- > commit-graph.c | 27 ++++++++++----------------- > t/t5000-tar-tree.sh | 20 +++++++++++++++++++- > 2 files changed, 29 insertions(+), 18 deletions(-) > > diff --git a/commit-graph.c b/commit-graph.c > index 94503e584b..e8362e144e 100644 > --- a/commit-graph.c > +++ b/commit-graph.c > @@ -749,15 +749,24 @@ static void fill_commit_graph_info(struct commit *item, struct commit_graph *g, > const unsigned char *commit_data; > struct commit_graph_data *graph_data; > uint32_t lex_index; > + uint64_t date_high, date_low; > > while (pos < g->num_commits_in_base) > g = g->base_graph; > > + if (pos >= g->num_commits + g->num_commits_in_base) > + die(_("invalid commit position. commit-graph is likely corrupt")); > + > lex_index = pos - g->num_commits_in_base; > commit_data = g->chunk_commit_data + GRAPH_DATA_WIDTH * lex_index; > > graph_data = commit_graph_data_at(item); > graph_data->graph_pos = pos; > + > + date_high = get_be32(commit_data + g->hash_len + 8) & 0x3; > + date_low = get_be32(commit_data + g->hash_len + 12); > + item->date = (timestamp_t)((date_high << 32) | date_low); > + > graph_data->generation = get_be32(commit_data + g->hash_len + 8) >> 2; > } > > @@ -772,38 +781,22 @@ static int fill_commit_in_graph(struct repository *r, > { > uint32_t edge_value; > uint32_t *parent_data_ptr; > - uint64_t date_low, date_high; > struct commit_list **pptr; > - struct commit_graph_data *graph_data; > const unsigned char *commit_data; > uint32_t lex_index; > > while (pos < g->num_commits_in_base) > g = g->base_graph; > > - if (pos >= g->num_commits + g->num_commits_in_base) > - die(_("invalid commit position. commit-graph is likely corrupt")); > + fill_commit_graph_info(item, g, pos); > > - /* > - * Store the "full" position, but then use the > - * "local" position for the rest of the calculation. > - */ > - graph_data = commit_graph_data_at(item); > - graph_data->graph_pos = pos; > lex_index = pos - g->num_commits_in_base; > - > commit_data = g->chunk_commit_data + (g->hash_len + 16) * lex_index; > > item->object.parsed = 1; > > set_commit_tree(item, NULL); > > - date_high = get_be32(commit_data + g->hash_len + 8) & 0x3; > - date_low = get_be32(commit_data + g->hash_len + 12); > - item->date = (timestamp_t)((date_high << 32) | date_low); > - > - graph_data->generation = get_be32(commit_data + g->hash_len + 8) >> 2; > - > pptr = &item->parents; > > edge_value = get_be32(commit_data + g->hash_len); All right, looks good for me. Here second change begins. > diff --git a/t/t5000-tar-tree.sh b/t/t5000-tar-tree.sh > index 3ebb0d3b65..8f41cdc509 100755 > --- a/t/t5000-tar-tree.sh > +++ b/t/t5000-tar-tree.sh > @@ -431,11 +431,29 @@ test_expect_success TAR_HUGE,LONG_IS_64BIT 'system tar can read our huge size' ' > test_cmp expect actual > ' > > +test_expect_success TIME_IS_64BIT 'set up repository with far-future commit' ' > + rm -f .git/index && > + echo foo >file && > + git add file && > + GIT_COMMITTER_DATE="@17179869183 +0000" \ > + git commit -m "tempori parendum" > +' > + > +test_expect_success TIME_IS_64BIT 'generate tar with future mtime' ' > + git archive HEAD >future.tar > +' > + > +test_expect_success TAR_HUGE,TIME_IS_64BIT,TIME_T_IS_64BIT 'system tar can read our future mtime' ' > + echo 2514 >expect && > + tar_info future.tar | cut -d" " -f2 >actual && > + test_cmp expect actual > +' > + Everything is all right, except we now have duplicated test names. Perhaps in the three following tests we should use 'far-far-future commit' and 'far future mtime' in place of current 'far-future commit' and 'future mtime' for tests checking handling 12-digital ditgits overflow, or add description how far the future is, for example 'far-future commit (2^11 + 1)', etc. > test_expect_success TIME_IS_64BIT 'set up repository with far-future commit' ' > rm -f .git/index && > echo content >file && > git add file && > - GIT_COMMITTER_DATE="@68719476737 +0000" \ > + GIT_TEST_COMMIT_GRAPH=0 GIT_COMMITTER_DATE="@68719476737 +0000" \ > git commit -m "tempori parendum" > ' Best, -- Jakub Narębski