Taylor Blau <me@xxxxxxxxxxxx> writes: > On Mon, Oct 28, 2024 at 02:43:39PM +0100, Karthik Nayak wrote: >> The struct `packed_git` holds information regarding a packed object >> file. Let's add the repository variable to this object, to represent the >> repository that this packfile belongs to. This helps remove dependency >> on the global `the_repository` object in `packfile.c` by simply using >> repository information now readily available in the struct. > > Makes sense, good. I think it would be useful here to capture some of > the discussion from just before you sent this series to indicate why > it's OK to use the_repository even when we have alternates. > > I think it is now quite obvious in retrospect, but let's do our future > selves a service by capturing it here, too ;-). > Agree, will add it in the next version. >> --- >> 10 files changed, 30 insertions(+), 16 deletions(-) > > Oh, good. I am glad to see that this new approach is already yielding > far less disruptive of a change. > >> diff --git a/builtin/fast-import.c b/builtin/fast-import.c >> index 76d5c20f14..ffee7d3abd 100644 >> --- a/builtin/fast-import.c >> +++ b/builtin/fast-import.c >> @@ -765,6 +765,7 @@ static void start_packfile(void) >> >> p->pack_fd = pack_fd; >> p->do_not_close = 1; >> + p->repo = the_repository; > > Makes sense. Here we are crafting the packfile by hand, so initializing > ->repo directly makes sense here. > > It would be nice if we could rewrite this in terms of > packfile.c:alloc_packed_git(), but that is a static function. Exposing > it as non-static is probably showing too much of the internals, so I > think leaving this as-is makes sense. > Yes, I did consider that too, but dropped it for the same reasons you stated. >> diff --git a/commit-graph.c b/commit-graph.c >> index 5bd89c0acd..83dd69bfeb 100644 >> --- a/commit-graph.c >> +++ b/commit-graph.c >> @@ -1914,7 +1914,7 @@ static int fill_oids_from_packs(struct write_commit_graph_context *ctx, >> struct packed_git *p; >> strbuf_setlen(&packname, dirlen); >> strbuf_addstr(&packname, pack_indexes->items[i].string); >> - p = add_packed_git(packname.buf, packname.len, 1); >> + p = add_packed_git(ctx->r, packname.buf, packname.len, 1); > > I wondered if ctx->r was the right choice here or not, but it is, and it > is (currently) always equal to the value of the_repository, so it's a > moot point. Let's keep going... > >> diff --git a/midx-write.c b/midx-write.c >> index b3a5f6c516..c57726ef94 100644 >> --- a/midx-write.c >> +++ b/midx-write.c >> @@ -154,7 +154,7 @@ static void add_pack_to_midx(const char *full_path, size_t full_path_len, >> return; >> >> ALLOC_GROW(ctx->info, ctx->nr + 1, ctx->alloc); >> - p = add_packed_git(full_path, full_path_len, 0); >> + p = add_packed_git(the_repository, full_path, full_path_len, 0); > > Ugh. I thought we had already added a repository field to our auxiliary > write_midx_context struct, but we have not, so this change looks right > to me. Doing so (adding that new field) seems like it would be a good > piece of #leftoverbits. > >> diff --git a/midx.c b/midx.c >> index e82d4f2e65..8edb75f51d 100644 >> --- a/midx.c >> +++ b/midx.c >> @@ -464,7 +464,7 @@ int prepare_midx_pack(struct repository *r, struct multi_pack_index *m, >> strhash(key.buf), key.buf, >> struct packed_git, packmap_ent); >> if (!p) { >> - p = add_packed_git(pack_name.buf, pack_name.len, m->local); >> + p = add_packed_git(r, pack_name.buf, pack_name.len, m->local); > > OK, so here we're trusting the value of 'r' from the caller. That comes > from 64404a24cf (midx: pass a repository pointer, 2019-04-29), which is > doing the right thing. (As an aside, I thought that that change was from > when we added the --object-dir flag to 'git multi-pack-index', but the > change is in fact unrelated and has to do with adding installed packs to > the repository's MRU list). > >> if (p) { >> install_packed_git(r, p); >> list_add_tail(&p->mru, &r->objects->packed_git_mru); >> diff --git a/object-store-ll.h b/object-store-ll.h >> index 53b8e693b1..8b31072b09 100644 >> --- a/object-store-ll.h >> +++ b/object-store-ll.h >> @@ -4,6 +4,7 @@ >> #include "hashmap.h" >> #include "object.h" >> #include "list.h" >> +#include "repository.h" > > Hmm. Do we need to include all of repository.h here? I don't think we > do, because we never peek into any of the fields of that structure from > within this header. So I think you could do something like: > > --- 8< --- > diff --git a/object-store-ll.h b/object-store-ll.h > index 6f9f4276e6..bcfae2e1bf 100644 > --- a/object-store-ll.h > +++ b/object-store-ll.h > @@ -4,13 +4,13 @@ > #include "hashmap.h" > #include "object.h" > #include "list.h" > -#include "repository.h" > #include "thread-utils.h" > #include "oidset.h" > > struct oidmap; > struct oidtree; > struct strbuf; > +struct repository; > > struct object_directory { > struct object_directory *next; > --- >8 --- > > instead of #include-ing the whole thing, which would be preferable. > This is much better, I will patch it in. >> #include "thread-utils.h" >> #include "oidset.h" >> >> @@ -135,6 +136,10 @@ struct packed_git { >> */ >> const uint32_t *mtimes_map; >> size_t mtimes_size; >> + >> + /* repo dentoes the repository this packed file belongs to */ >> + struct repository *repo; > > Calling this 'repo' makes sense, but... > >> diff --git a/packfile.c b/packfile.c >> index 9560f0a33c..45f300e5e1 100644 >> --- a/packfile.c >> +++ b/packfile.c >> @@ -217,11 +217,12 @@ uint32_t get_pack_fanout(struct packed_git *p, uint32_t value) >> return ntohl(level1_ofs[value]); >> } >> >> -static struct packed_git *alloc_packed_git(int extra) >> +static struct packed_git *alloc_packed_git(struct repository *repo, int extra) > > Here and elsewhere, I think our usual convention is to call a 'struct > repository *' (when it is a formal parameter of some function) just "r" > instead of "repo". > > At least that's what my intuition told me, and a very rough grep says > that '*r' appears as a parameter 815 times, while '*repo' appears only > 577 times. It's close, but I think that '*r' is preferred here since > it's fewer characters. > I agree, by now you know I prefer readability over fewer characters, so it more of an intentional choice. But here, I think it can be '*r' though, since it is sort of obvious what 'r' refers to in most cases. I will change this in all commits in the next version. >> { >> struct packed_git *p = xmalloc(st_add(sizeof(*p), extra)); >> memset(p, 0, sizeof(*p)); > > Not at all the fault of this patch, but it feels like a bit of a > foot-gun to allocate a bounds-checked version of 'sizeof(*p)+extra', > while only zero'ing the first 'sizeof(*p)' bytes. I think in all cases > where it actually matters via add_packed_git() we fill out that extra > space anyway, but it might be nice cleanup to do something like: > > struct packed_git *p; > size_t sz = sizeof(*p) + extra; > > p = xcalloc(1, sz); > > , or something. But that can be dealt with later and/or as #leftoverbits. > Like you stated, there is bunch of refactoring we could do here, I mostly don't want to digress and create too much noise, so I will turn a blind eye in some cases and leave it for later. > The rest is looking good, nicely done. Let's keep reading... > > Thanks, > Taylor Thanks for the review.
Attachment:
signature.asc
Description: PGP signature