On Wed, Sep 27, 2023 at 3:55 PM Eric W. Biederman <ebiederm@xxxxxxxxx> wrote: > Two basic functions are provided: > - convert_object_file Takes an object file it's type and hash algorithm > and converts it into the equivalent object file that would > have been generated with hash algorithm "to". > > For blob objects there is no converstion to be done and it is an > error to use this function on them. s/converstion/conversion/ > For commit, tree, and tag objects embedded oids are replaced by the > oids of the objects they refer to with those objects and their > object ids reencoded in with the hash algorithm "to". Signatures > are rearranged so that they remain valid after the object has > been reencoded. > > - repo_oid_to_algop which takes an oid that refers to an object file > and returns the oid of the equavalent object file generated > with the target hash algorithm. s/equavalent/equivalent/ > The pair of files object-file-convert.c and object-file-convert.h are > introduced to hold as much of this logic as possible to keep this > conversion logic cleanly separated from everything else and in the > hopes that someday the code will be clean enough git can support > compiling out support for sha1 and the various conversion functions. > > Signed-off-by: "Eric W. Biederman" <ebiederm@xxxxxxxxxxxx> Just some minor comments below, many of which are subjective style-related observations, thus not necessarily actionable, but also one or two legitimate questions. > diff --git a/object-file-convert.c b/object-file-convert.c > @@ -0,0 +1,57 @@ > +int repo_oid_to_algop(struct repository *repo, const struct object_id *src, > + const struct git_hash_algo *to, struct object_id *dest) > +{ > + /* > + * If the source alogirthm is not set, then we're using the > + * default hash algorithm for that object. > + */ s/alogirthm/algorithm/ > + const struct git_hash_algo *from = > + src->algo ? &hash_algos[src->algo] : repo->hash_algo; > + > + if (from == to) { > + if (src != dest) > + oidcpy(dest, src); > + return 0; > + } > + return -1; > +} On this project, we usually get the simple cases out of the way first, which often reduces the indentation level, making the code easier to digest at a glance. So, it would be typical to write this as: if (from != to) return -1 if (src != dest) oidcpy(dest, src); return 0; or even: if (from != to) return -1 if (src == dest) return 0; oidcpy(dest, src); return 0; This way, for instance, the reader doesn't get to the end of the function and then have to scan backward to understand the condition of the `return -1`. > +int convert_object_file(struct strbuf *outbuf, > + const struct git_hash_algo *from, > + const struct git_hash_algo *to, > + const void *buf, size_t len, > + enum object_type type, > + int gentle) > +{ > + int ret; > + > + /* Don't call this function when no conversion is necessary */ > + if ((from == to) || (type == OBJ_BLOB)) > + die("Refusing noop object file conversion"); Several comments... Style: we usually reduce the noise level by dropping the extra parentheses: if (from == to || type == OBJ_BLOB) Does this condition represent a programming error or a runtime error triggerable by some input? If a programming error, then use BUG() rather than die(). If a triggerable runtime error, then... * start user-facing messages with lowercase rather than capitalized word * make the user-facing message localizable so readers of other languages can digest it die(_("refusing do-nothing object conversion")); On the other hand, don't make BUG() messages localizable. > + switch (type) { > + case OBJ_COMMIT: > + case OBJ_TREE: > + case OBJ_TAG: > + default: > + /* Not implemented yet, so fail. */ > + ret = -1; > + break; > + } > + if (!ret) > + return 0; > + if (gentle) { > + strbuf_release(outbuf); > + return ret; > + } This function appears to be a mere skeleton at the moment, so it's difficult to judge at this point whether you are using `outbuf` as a bag of bytes or as a legitimate string container. If the latter, then the API may be reasonable, but if you're using it as a bag-of-bytes, then it feels like you're leaking an implementation detail into the API. > + die(_("Failed to convert object from %s to %s"), > + from->name, to->name); s/Failed/failed/ For people trying to diagnose this problem, would it be helpful to present more information about the failed conversion, such as object type and perhaps even its OID? > diff --git a/object-file-convert.h b/object-file-convert.h > @@ -0,0 +1,24 @@ > +int repo_oid_to_algop(struct repository *repo, const struct object_id *src, > + const struct git_hash_algo *to, struct object_id *dest); I suppose the function name is pretty much self-explanatory to those familiar with the underlying concepts, but it might still be helpful to add a comment explaining what the function does. > +/* > + * Convert an object file from one hash algorithm to another algorithm. > + * Return -1 on failure, 0 on success. > + */ > +int convert_object_file(struct strbuf *outbuf, > + const struct git_hash_algo *from, > + const struct git_hash_algo *to, > + const void *buf, size_t len, > + enum object_type type, > + int gentle);