"Derrick Stolee via GitGitGadget" <gitgitgadget@xxxxxxxxx> writes: > From: Derrick Stolee <dstolee@xxxxxxxxxxxxx> > > Before refactoring into the chunk-format API, the commit-graph parsing > logic included checks for duplicate chunks. It is unlikely that we would > desire a chunk-based file format that allows duplicate chunk IDs in the > table of contents, so add duplicate checks into > read_table_of_contents(). Makes sense. This answers a question I had while reading one of the previous steps about the design, I think. However... > diff --git a/chunk-format.c b/chunk-format.c > index 74501084cf8..1ee875df423 100644 > --- a/chunk-format.c > +++ b/chunk-format.c > @@ -14,6 +14,7 @@ struct chunk_info { > chunk_write_fn write_fn; > > const void *start; > + unsigned found:1; This defines a .found member ... > @@ -98,6 +99,7 @@ int read_table_of_contents(struct chunkfile *cf, > uint64_t toc_offset, > int toc_length) > { > + int i; > uint32_t chunk_id; > const unsigned char *table_of_contents = mfile + toc_offset; > > @@ -124,6 +126,14 @@ int read_table_of_contents(struct chunkfile *cf, > return -1; > } > > + for (i = 0; i < cf->chunks_nr; i++) { > + if (cf->chunks[i].id == chunk_id) { > + error(_("duplicate chunk ID %"PRIx32" found"), > + chunk_id); > + return -1; > + } > + } > + > cf->chunks[cf->chunks_nr].id = chunk_id; > cf->chunks[cf->chunks_nr].start = mfile + chunk_offset; > cf->chunks[cf->chunks_nr].size = next_chunk_offset - chunk_offset; ... and no new code touches it. The way duplicate is found is by having a inner loop that checks the IDs of chunks we've seen so far (quadratic, but presumably that would not matter as long as we'd be dealing with just half a dozen chunk types). Is the .found bit used for something else and needs to be added in a different step?