Taylor Blau <me@xxxxxxxxxxxx> writes: > On Mon, Oct 28, 2024 at 02:43:43PM +0100, Karthik Nayak wrote: >> diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c >> index 0800714267..2b2816c243 100644 >> --- a/builtin/pack-objects.c >> +++ b/builtin/pack-objects.c >> @@ -1529,7 +1529,7 @@ static int want_found_object(const struct object_id *oid, int exclude, >> return 0; >> if (ignore_packed_keep_in_core && p->pack_keep_in_core) >> return 0; >> - if (has_object_kept_pack(oid, flags)) >> + if (has_object_kept_pack(the_repository, oid, flags)) > > Do we want to use p->repo here instead of the_repository? I think the > answer is "yes" since in this function we are given a pack "p" and want > to determine if the given object contained in "p" is useful to pack. If > not, we want to search for it among other packs here, likely within the > same repository. > > (Again, probably a moot point here since this is all going to be > the_repository anyway, but just thinking aloud...). > I don't think it is a moot point at all. We do want to move up the layers and cleanup usage of global variables. Reducing the work required definitely gets us there faster. >> } >> >> @@ -3627,7 +3627,7 @@ static void show_cruft_commit(struct commit *commit, void *data) >> >> static int cruft_include_check_obj(struct object *obj, void *data UNUSED) >> { >> - return !has_object_kept_pack(&obj->oid, IN_CORE_KEEP_PACKS); >> + return !has_object_kept_pack(the_repository, &obj->oid, IN_CORE_KEEP_PACKS); > > Here we don't know what pack "obj" is contained in, which makes sense > since this is a traversal callback, not something that is iterating over > the contents of a particular pack or similar. So using the_repository is > right here. > > Although... should we be using to_pack->repo here over the_repository > (in builtin/pack-objects.c)? The rest of the code definitely does *not* > do that, but I think probably should. I think so too, I won't change existing code, but makes sense to do it in our patches. Will amend. > >> static int cruft_include_check(struct commit *commit, void *data) >> diff --git a/diff.c b/diff.c >> index dceac20d18..1d483bdf37 100644 >> --- a/diff.c >> +++ b/diff.c >> @@ -4041,7 +4041,8 @@ static int reuse_worktree_file(struct index_state *istate, >> * objects however would tend to be slower as they need >> * to be individually opened and inflated. >> */ >> - if (!FAST_WORKING_DIRECTORY && !want_file && has_object_pack(oid)) >> + if (!FAST_WORKING_DIRECTORY && !want_file && >> + has_object_pack(the_repository, oid)) >> return 0; >> >> /* >> diff --git a/list-objects.c b/list-objects.c >> index 985d008799..31236a8dc9 100644 >> --- a/list-objects.c >> +++ b/list-objects.c >> @@ -41,7 +41,8 @@ static void show_object(struct traversal_context *ctx, >> { >> if (!ctx->show_object) >> return; >> - if (ctx->revs->unpacked && has_object_pack(&object->oid)) >> + if (ctx->revs->unpacked && has_object_pack(ctx->revs->repo, >> + &object->oid)) >> return; >> >> ctx->show_object(object, name, ctx->show_data); >> diff --git a/pack-bitmap.c b/pack-bitmap.c >> index 4fa9dfc771..d34ba9909a 100644 >> --- a/pack-bitmap.c >> +++ b/pack-bitmap.c >> @@ -1889,7 +1889,7 @@ static void filter_packed_objects_from_bitmap(struct bitmap_index *bitmap_git, >> bitmap_unset(result, i); >> >> for (i = 0; i < eindex->count; ++i) { >> - if (has_object_pack(&eindex->objects[i]->oid)) >> + if (has_object_pack(the_repository, &eindex->objects[i]->oid)) > > Interesting. I think the_repository in practice is fine here, but I > might have expected something like bitmap_git->p->repo, or the > equivalent for the MIDX case. > > So I was going to suggest something like: > > static struct repository *bitmap_repo(const struct bitmap_index *bitmap_git) > { > if (bitmap_is_midx(bitmap_git)) > return bitmap_git->midx->repo; > return bitmap_git->pack->repo; > } > > and then rewriting this as: > > if (has_object_pack(bitmap_repo(bitmap_git), &eindex->objects[i]->oid)) > > , but we can't do that, because the MIDX structure does not know what > repository it belongs to, only the object_dir it resides in! > Exactly, I agree it should be achieved from `struct bitmap_index`. Unfortunately we can't with the current state as you noted. > And I think that causes wrinkles earlier in your series that I didn't > think of at the time when reviewing, because it seems odd in retrospect > that, e.g. we have something like: > > load_multi_pack_index(the_repository->objects->odb->path, ...); > > where we pass in the object_dir path directly, but other functions like > prepare_midx_pack() that take in a 'struct repository *'. > > I wonder if we should be initializing the MIDX with a repository > pointer, so that it knows what repository it belongs to. I suspect that > we will still have to pass in a separate string indicating the > object_dir, likely because of the --object-dir quirk I mentioned > earlier. > > But my main thought here is that we should be able to infer from a > 'struct bitmap_index *' what repository it belongs to instead of using > 'the_repository' here directly. > I also think it makes sense to progress to the goal of removing global variables in a way where we primarily focus on a single file/subsystem at a time. And directionally between the bottom <> top levels. This patch series focuses on the `packfile.c` file, so we cleanup the file and remove associated usages of the global variable and try to also follow some form of cleanup as we go. But for other files, it is okay to still rely on the global variables. Slowly when the cleanup phase reaches those files, we can give our focus to those files. So here, it would be nice to have MIDX have a repository pointer too, but I think we'd be overshooting trying to refactor that in this series. So I'd leave it as is and focus on that when we get to cleaning up `pack-bitmap.c`. > The rest all looks quite reasonable to me. > > Thanks, > Taylor
Attachment:
signature.asc
Description: PGP signature