On Tue, May 28, 2024 at 09:33:07AM -0700, Junio C Hamano wrote: > Patrick Steinhardt <ps@xxxxxx> writes: > > > On Tue, May 28, 2024 at 06:32:01AM +0000, Philip Peterson via GitGitGadget wrote: > >> diff --git a/apply.c b/apply.c > >> index 901b67e6255..364c05fbd06 100644 > >> --- a/apply.c > >> +++ b/apply.c > >> @@ -3218,13 +3218,13 @@ static int apply_binary(struct apply_state *state, > >> return 0; /* deletion patch */ > >> } > >> > >> - if (has_object(the_repository, &oid, 0)) { > >> + if (has_object(state->repo, &oid, 0)) { > >> /* We already have the postimage */ > >> enum object_type type; > >> unsigned long size; > >> char *result; > >> > >> - result = repo_read_object_file(the_repository, &oid, &type, > >> + result = repo_read_object_file(state->repo, &oid, &type, > >> &size); > >> if (!result) > >> return error(_("the necessary postimage %s for " > > > > We call `get_oid_hex()` in this function, which ultimately ends up > > accessing `the_repository` via `the_hash_algo`. We should likely convert > > those to `repo_get_oid()`. > > > > There are also other accesses to `the_hash_algo` in this function, which > > should be converted to use `state->repo->hash_algo`, as well. > > We as a more experienced developers should come up with a way to > help developers who are less familiar with the API set, so that they > can chip in this effort to wean ourselves off of globals. > > Would a bug like the ones you pointed out be easily caught by say > running "GIT_TEST_DEFAULT_HASH=sha256 make test"? No, I don't think so. I was also thinking about different approaches to this problem overall. Ideally, I would want to catch this on the programmatic level so that we do not even have to detect this at runtime. And I think this should be feasible by introducing something similar to the USE_THE_INDEX_VARIABLE macro, which we have recently removed. If we had a USE_THE_REPOSITORY_VARIABLE macro that guards declarations of: - `the_repository` - `the_hash_algo` - Functions that implicitly rely on either of the above. Then you could prove that a given code unit does not rely on any of the above anymore by not declaring that macro. In fact, these patches prompted me to give this a go this morning. And while the changes are of course trivial by themselves, I quickly discovered that they are also pointless because we almost always rely on either of the above. The most important reason of this is "hash.h", where `struct object_id` falls back to `the_hash_algo` in case its own hash algorithm wasn't set. Ideally, this would never be the case. But a quick test shows that there are many places where we do rely on the fallback value, mostly around null OIDs. Fixing this would be a necessary prerequisite. Another important benefit of the macro would be that we stop working against a moving target. New files shouldn't declare it, and once a file has been refactored and the corresponding macro was removed we would know that we don't add new dependencies on either of the above by accident. Patrick
Attachment:
signature.asc
Description: PGP signature