On Thu, May 07, 2020 at 04:40:05PM -0400, Jeff King wrote: > On Mon, May 04, 2020 at 07:14:03PM -0600, Taylor Blau wrote: > > > If callers do wish to retain this behavior, they can easily work around > > this change by doing the following: > > > > git for-each-ref --format='%(objectname) %(objecttype) %(*objecttype)' | > > awk '/commit/ { print $1 }' | > > git commit-graph write --stdin-commits > > I know this came from my earlier email, but I think that recipe actually > shows how to make your input work even if --check-oids were the default. > If you really want the --check-oids behavior, you'd want the opposite: > to complain about those ones. So it's something like: > > git for-each-ref --format='%(objectname) %(objecttype) %(*objecttype)' | > awk ' > !/commit/ { print "not-a-commit:"$1 } > /commit/ { print $1 } > ' | > git commit-graph write --stdin-commits > > > diff --git a/Documentation/git-commit-graph.txt b/Documentation/git-commit-graph.txt > > index 53a650225a..fcac7d12e1 100644 > > --- a/Documentation/git-commit-graph.txt > > +++ b/Documentation/git-commit-graph.txt > > @@ -47,8 +47,10 @@ with `--stdin-commits` or `--reachable`.) > > + > > With the `--stdin-commits` option, generate the new commit graph by > > walking commits starting at the commits specified in stdin as a list > > -of OIDs in hex, one OID per line. (Cannot be combined with > > -`--stdin-packs` or `--reachable`.) > > +of OIDs in hex, one OID per line. OIDs that resolve to non-commits > > +(either directly, or by peeling tags) are silently ignored. OIDs that > > +are malformed, or do not exist generate an error. (Cannot be combined > > +with `--stdin-packs` or `--reachable`.) > > Yeah, I think these semantics are good. > > > diff --git a/builtin/commit-graph.c b/builtin/commit-graph.c > > index 9eec68572f..3637d079fb 100644 > > --- a/builtin/commit-graph.c > > +++ b/builtin/commit-graph.c > > @@ -153,13 +153,14 @@ static int read_one_commit(struct oidset *commits, struct progress *progress, > > > > display_progress(progress, oidset_size(commits) + 1); > > > > + if (oid_object_info(the_repository, &oid, NULL) < 0) { > > + error(_("object %s does not exist"), hash); > > + return 1; > > + } > > + > > result = lookup_commit_reference_gently(the_repository, &oid, 1); > > if (result) > > oidset_insert(commits, &result->object.oid); > > - else { > > - error(_("invalid commit object id: %s"), hash); > > - return 1; > > - } > > return 0; > > } > > We can avoid the object-existence check entirely if > lookup_commit_reference_gently() gives us an answer. And we'd expect > that to be the common path. > > Also, using has_object_file() is cheaper than oid_object_info(), since > it doesn't have to resolve the type for deltas. > > So perhaps: > > result = lookup_commit_reference_gently(...); > if (result) > oidset_insert(...); > else if (has_object_file(&oid)) > ; /* not a commit; quietly ignore; > else > return error(no such object...); > > That said, I think this technique misses some cases of corruption. > You're checking that the outer-most object exists, but not any > intermediate peeled objects. I.e., lookup_commit_reference_gently() > might have failed for two reasons: > > - an object it peeled to didn't exist > > - an object it peeled to wasn't a commit > > To do it thoroughly, I think you'd have to call deref_tag() yourself and > distinguish NULL there (an error) from a result where obj->type isn't > OBJ_COMMIT (quietly ignore). Makes sense. I initially worried a little bit about what to call the error message (i.e., is is "this object doesn't exist" or "this object exists but peels to a non-commit, or an otherwise missing object"). But, I think just saying "invalid object: <hash>" is good enough here. > > enum commit_graph_write_flags { > > - COMMIT_GRAPH_WRITE_APPEND = (1 << 0), > > - COMMIT_GRAPH_WRITE_PROGRESS = (1 << 1), > > - COMMIT_GRAPH_WRITE_SPLIT = (1 << 2), > > - /* Make sure that each OID in the input is a valid commit OID. */ > > - COMMIT_GRAPH_WRITE_CHECK_OIDS = (1 << 3), > > - COMMIT_GRAPH_WRITE_BLOOM_FILTERS = (1 << 4), > > + COMMIT_GRAPH_WRITE_APPEND = (1 << 0), > > + COMMIT_GRAPH_WRITE_PROGRESS = (1 << 1), > > + COMMIT_GRAPH_WRITE_SPLIT = (1 << 2), > > + COMMIT_GRAPH_WRITE_BLOOM_FILTERS = (1 << 3) > > As much as I love looking at matched-indentation lists, I think this > diff is a good example of why it's not worth doing. It's much easier to > see what's going on if the first three items aren't touched. I'd > actually even leave BLOOM_FILTERS where it is, and accept the hole which > could be refilled later. > > Your patch also loses the trailing comma after the final BLOOM_FILTERS > entry. Fixed both, thanks. I'll build these eight new patches and send them shortly, thanks for your review. > -Peff Thanks, Taylor