Ævar Arnfjörð Bjarmason <avarab@xxxxxxxxx> writes: > Add and use a hash_short_id_by_algo() function. As noted in the > comment being modified here (added in [1]) the intention wasn't to > have these end up in on-disk formats, but since [2], [3] and [3] double [3]? > that's been the case, and there's an outstanding patch to add another > format that uses these[5]. > > So let's expose this functionality as a documented utility function, > instead of copy/pasting this code in various places. > > Replacing the die() in the existing functions with a BUG() might be > overzelous, it's correct for the case of > e.g. write_commit_graph_file() and write_midx_header(), but we also > use this for parsing on-disk files, e.g. in parse_commit_graph(). If we know the offending data can come from outside, not from literals in the code, then there is no "might be", such a use of BUG() is simply wrong. > We could add a "gently" version of this, but for now I think that > worrying about the distinction would be worrying too much. If we ever > end up parsing such files that'll almost certainly be a bug in our own > writing code, so the distinction would be rather academic, even though It would be a file written by an ancient buggy version of our code, or a buggy third-party reimplementation of Git. It could be that a new version of Git is using a yet-to-be-invented algorithm this version of Git does not know about. The distinction matters in that "The version of Git I downloaded and built last week out of the latest release tag said BUG" should mean only one thing: that version that reports BUG() is the culprit, not some random other thing we do not even know where it came from that left a corrupt data on disk.