On Wed, Nov 20, 2024 at 02:13:50PM -0500, Taylor Blau wrote: > +static const struct git_hash_algo sha1_unsafe_algo = { > + .name = "sha1", > + .format_id = GIT_SHA1_FORMAT_ID, > + .rawsz = GIT_SHA1_RAWSZ, > + .hexsz = GIT_SHA1_HEXSZ, > + .blksz = GIT_SHA1_BLKSZ, > + .init_fn = git_hash_sha1_init_unsafe, > + .clone_fn = git_hash_sha1_clone_unsafe, > + .update_fn = git_hash_sha1_update_unsafe, > + .final_fn = git_hash_sha1_final_unsafe, > + .final_oid_fn = git_hash_sha1_final_oid_unsafe, > + .empty_tree = &empty_tree_oid, > + .empty_blob = &empty_blob_oid, > + .null_oid = &null_oid_sha1, > +}; All of the non-function fields here naturally must match the ones in the parent algo struct, or chaos ensues. That's a little fragile, but it's not like we're adding new algorithm variants a lot. The biggest risk, I guess, would be adding a new field to git_hash_algo which defaults to zero-initialization here. But again, there are only three total and they are defined near each other here, so I don't think it's a big risk overall. I think this struct is a potential landmine for hash_algo_by_ptr(): static inline int hash_algo_by_ptr(const struct git_hash_algo *p) { return p - hash_algos; } It's undefined behavior to pass in sha1_unsafe_algo to this function (but the compiler would not complain since the types are the same). I don't find it incredibly likely that somebody would want to do that on an unsafe variant, but I'm not thrilled about leaving that wart for somebody to find. If we don't care about the speed of this function, then an implementation like: for (i = 0; i < GIT_HASH_NALGOS; i++) { if (p == &hash_algos[i] || p == hash_algos[i]->unsafe) return i; } return GIT_HASH_UNKNOWN; would work. I'm not sure if that would be measurable. I was surprised at the number of places that hash_algo_by_ptr() is called. Many low-level oid functions need it because we store the integer id there rather than a direct pointer (so oidread(), oidclr(), oid_object_info_extended(), and so on). But I'd also expect the loop above to be pretty fast. So I dunno. > const struct git_hash_algo hash_algos[GIT_HASH_NALGOS] = { > { > .name = NULL, > @@ -241,6 +257,7 @@ const struct git_hash_algo hash_algos[GIT_HASH_NALGOS] = { > .unsafe_update_fn = git_hash_sha1_update_unsafe, > .unsafe_final_fn = git_hash_sha1_final_unsafe, > .unsafe_final_oid_fn = git_hash_sha1_final_oid_unsafe, > + .unsafe = &sha1_unsafe_algo, > .empty_tree = &empty_tree_oid, > .empty_blob = &empty_blob_oid, > .null_oid = &null_oid_sha1, OK, and we can leave the sha256 pointer zero-initialized, since the function handles that at runtime. Good. -Peff