On 4/4/2021 2:09 AM, Junio C Hamano wrote: > "Chinmoy via GitGitGadget" <gitgitgadget@xxxxxxxxx> writes: > >> diff --git a/sparse-index.c b/sparse-index.c >> index 95ea17174da3..e4323ffd81db 100644 >> --- a/sparse-index.c >> +++ b/sparse-index.c >> @@ -128,12 +128,14 @@ int set_sparse_index_config(struct repository *repo, int enable) >> int convert_to_sparse(struct index_state *istate) >> { >> int test_env; >> + struct repository *r = the_repository; >> + >> if (istate->split_index || istate->sparse_index || >> !core_apply_sparse_checkout || !core_sparse_checkout_cone) >> return 0; >> >> if (!istate->repo) >> - istate->repo = the_repository; >> + istate->repo = r; >> > > I am not quite sure if this is a reasonable conversion. Surely it > would not make the compiler barf, but are we sure that the caller of > convert_to_sparse() wants us to work on the_repository instance and > no other repository? As an istate has a .repo member, it seems to > me that using istate->repo would be a lot saner approach than > assigning the_repository upfront to r. It might be even needed, if > cache_tree_update() wants to take a repository instance, to ensure > that all callers to this helper sets istate->repo before they call > it so that the above "if not set yet, use the_repository" code does > not have to kick in. > >> /* >> * The GIT_TEST_SPARSE_INDEX environment variable triggers the >> @@ -161,7 +163,7 @@ int convert_to_sparse(struct index_state *istate) >> return -1; >> } >> >> - if (cache_tree_update(istate, 0)) { >> + if (cache_tree_update(r, istate, 0)) { > > And this looks like a bad conversion. It may happen to do the same > thing, but the flow of the logic up to this point in the function > was to make sure istate->repo is not empty by filling it if it is > not yet set, and update the cache tree of that istate. So, it seems > more logical if this call were like so, no? > > if (cache_tree_update(istate->repo, istate, 0)) { > > In fact, in the world after 1fd9ae51 (repository: add repo reference > to index_state, 2021-01-23), it is dubious that this topic to teach > cache_tree_update() to take a repository pointer is sensible. While > working on a single repository, we may create multiple in-core index > instances that represent temporary indices, but each of these in-core > index instances (i.e. istate) belong to a single repository. > > And in a call to cache_tree_update(R, I, F), if I->repo is *NOT* R, > that must mean a bug. Here is what 1fd9ae51 says on this point. > > repository: add repo reference to index_state > > It will be helpful to add behavior to index operations that might > trigger an object lookup. Since each index belongs to a specific > repository, add a 'repo' pointer to struct index_state that allows > access to this repository. > > Add a BUG() statement if the repo already has an index, and the index > already has a repo, but somehow the index points to a different repo. > > This will prevent future changes from needing to pass an additional > 'struct repository *repo' parameter and instead rely only on the 'struct > index_state *istate' parameter. > > Derrick, what's you thought on this? The patch under discussion > looks to me a prime example of "future change(s)" needing "to pass > an additional 'struct repository *repo' parameter". With your additional comments, I think it is clear that the "fourth option" I mentioned earlier [1] is the way to go: Finally, there is yet a fourth option: use istate->repo instead. In 1fd9ae51 (repository: add repo reference to index_state), I added a 'repo' member to struct index_state. This is intended for methods to access a repository directly from the index. [1] https://lore.kernel.org/git/f187df01-8e59-ac74-01e1-586a7a63fd4e@xxxxxxxxx/ So in this sense, we should always use istate->repo, but we might still need the following guard in some places: if (!istate->repo) istate->repo = the_repository; in case there are situations where the index is loaded before the_repository is loaded. I have hit this in testing, but don't fully understand the cases where this can happen. The way it would change this patch is to drop the 'struct repository *r' pointers and changes to the method signatures. Instead, keep the methods only taking a 'struct index_state *istate' and use istate->repo everywhere. Thanks, -Stolee