Taylor Blau <me@xxxxxxxxxxxx> writes: > On Thu, Mar 03, 2022 at 05:30:44PM +0100, Ævar Arnfjörð Bjarmason wrote: >> >> On Wed, Mar 02 2022, Taylor Blau wrote: >> >> > Consolidate these into a single definition in chunk-format.h. It's not >> > clear that this is the best header to define this function in, but it >> > should do for now. >> > [...] >> > + >> > +uint8_t oid_version(const struct git_hash_algo *algop) >> > +{ >> > + switch (hash_algo_by_ptr(algop)) { >> > + case GIT_HASH_SHA1: >> > + return 1; >> > + case GIT_HASH_SHA256: >> > + return 2; >> >> Not a new issue, but I wonder why these don't return hash_algo_by_ptr >> aka GIT_HASH_WHATEVER here. I.e. this is the same as this more >> straightforward & obvious code that avoids re-hardcoding the magic >> constants: > > Hmm. Certainly the value returned by hash_algo_by_ptr() works for SHA-1 > and SHA-256, but writes may want to use a different value for future > hashes. Not that this couldn't be changed then, but my feeling is that > the existing code is clearer since it avoids the reader having to jump > to hash_algo_by_ptr()'s implementation to figure out what it returns. If we promise that everywhere in file formats where we identify what hash is used, we write "1" for SHA1 and "2" for SHA256, it would be natural to define GIT_HASH_SHA1 to "1" and GIT_HASH_SHA256 to "2". And readers do not have to "figure out", if that is a clearly written guideline to represent the hash used in file formats. As written, the readers who -assumes- such a guideline is there must figure out from hash.h that GIT_HASH_SHA1 is 1 and GIT_HASH_SHA256 is 2 to be convinced that the above code is correct. Now, hash.h says GIT_HASH_SHA1 is 1 and GIT_HASH_SHA256 is 2. So int oidv = hash_algo_by_ptr(algop) switch (oidv) { case GIT_HASH_SHA1: case GIT_HASH_SHA256: return oidv; default: die(); } should work already. To put it differently, if this didn't work, we should renumber GIT_HASH_SHA1 and GIT_HASH_SHA256 to make it work, I would think. If not, we have a huge mess on our hands, as constants used in on-disk file formats is hard (almost impossible) to change. An overly generic function name oid_version() cannot be justified unless the same constants are used everywhere. I see hits from 'git grep oid_version' in chunk-format.c (obviously) commit-graph.c midx.c pack-write.c so presumably these types of files are using the "canonical" numbering. And when we introduce GIT_HASH_SHA3 or whatever, we should give it a number that this function can return (i.e. from the range 3..255).