"Derrick Stolee via GitGitGadget" <gitgitgadget@xxxxxxxxx> writes: > - 1-byte (reserved for later use) > - Current clients should ignore this value. > + 1-byte number (B) of base commit-graphs > + We infer the length (H*B) of the Base Graphs chunk > + from this value. > > CHUNK LOOKUP: > > @@ -92,6 +93,12 @@ CHUNK DATA: > positions for the parents until reaching a value with the most-significant > bit on. The other bits correspond to the position of the last parent. > > + Base Graphs List (ID: {'B', 'A', 'S', 'E'}) [Optional] > + This list of H-byte hashes describe a set of B commit-graph files that > + form a commit-graph chain. The graph position for the ith commit in this > + file's OID Lookup chunk is equal to i plus the number of commits in all > + base graphs. If B is non-zero, this chunk must exist. Hmph, an obvious alternative design would be to make the base list self describing without using the "reserved for future use" byte, which would allow more than 256 bases (not that being able to use 300 bases is necessarily a useful feature) and also leave the reserved byte unused. It's not like being able to detect discrepancy (e.g. B!=0 but BASE chunk is missing, and/or BASE chunk appears but B==0) adds value by offering more protection against file corruption, so I am wondering why it is a good idea to consume the reserved byte for this. > + if (n && !g->chunk_base_graphs) { > + warning(_("commit-graph has no base graphs chunk")); > + return 0; > + } > + n> while (n) { > n--; > + > + if (!oideq(&oids[n], &cur_g->oid) || > + !hasheq(oids[n].hash, g->chunk_base_graphs + g->hash_len * n)) { Here, load_commit_graph_chain() that goes over the on-disk chain file that lists graph files called us with 'n', which can run up to the number of graph files listed in that file---and that number can be more than what is recorded in the graph-list chunk, in which case we are over-reading with this hasheq(), right? It seems that parse_commit_graph() only cares about the beginning of each chunk, and a crafted graph file can record two chunks with a gap in between, or two chunks that overlap, and nobody would notice. Is that true? Wasted space in the file between two chunks (i.e. a gap) is not necessarily bad and may not be a warning-worthy thing, but two chunks that overlap is probably not a good idea and worth noticing. The only sanity check it seems to do is to forbid chunks of the same kind from appearing twice.