On Thu, Jan 30, 2020 at 05:32:16PM -0300, Matheus Tavares wrote: > The motivation for this patchset is another series I'm working on, to > make git-grep stop adding submodules' odbs to the alternates list. With > that, parse_object() will have to be called with the subdmodules' struct > repository. But it seemed that this function doesn't pass on the > received repo to some inner calls, which, instead, always use > the_repository. This series seeks to fix these inconsistencies. I read over the patches and they all seem sensible. There may be spots you missed where these functions are subtly using the_repository under the hood (or where you used the_repository but could have used some repository pointer tucked away in a struct). But even if that is the case, it seems clear that all of the changes are going in the correct direction, and we can continue to iterate on top. > Note: I also tried to replace some uses of the_hash_algo with the struct > git_hash_algo from the received repository, for consitency. (In > practice, I'm not sure if this is very useful right now, but maybe it > will be relevant for the change to SHA256?) Still, many functions in > parse_object()'s call chain call oid_to_hex(), which uses the_hash_algo. > Since changing this would require a much bigger operation, I decided to > leave it as is, for now. I'm not sure I look forward to a day where oid_to_hex() needs to take an extra parameter, just because it will make it more annoying to use. But we'll have to go there eventually unless we plan to forbid mixing of repositories with different hashes within the same process. The hash transition document says: To convert recorded submodule pointers, you need to have the converted submodule repository in place. The translation table of the submodule can be used to look up the new hash. which I take to mean that the local copy of the submodule needs to match the superproject hash (this says nothing about what the upstream owner of the submodule wants to do; we'd be translating on the fly to the new hash in the local repo). So using the_hash_algo would work either way. I don't think we're particularly interested in supporting multiple unrelated repositories within the same process. While that would be convenient for some cases (e.g., you have a million repositories and want to look at all of their trees without creating a million processes), I don't think it's a goal anyone is really working towards with this "struct repository" layer. > Note II: Despite receiving a repo through the apply_state struct, > apply.c:apply_binary() call functions which uses the_repository > internally. Because of that, I used the_hash_algo in this function in > patch 6. Should I change it to use apply_state->repo->hash_algo > anyway? I think this follows my "it's OK for now because we're going in the right direction from above". I.e., we probably do eventually want to use apply_state->repo->hash_algo, just like we probably do eventually want all those functions to take a repository argument. But by even using the_hash_algo, you've at least marked the spot for later fixing (the very final step of this long process will be eradicating all of the_hash_algo and the_repository from the bottom of the call-stack on up). -Peff