On Mon, Oct 28, 2024 at 08:36:50AM -0500, karthik nayak wrote: > Jeff King <peff@xxxxxxxx> writes: > > > On Sun, Oct 27, 2024 at 05:23:24PM -0400, karthik nayak wrote: > > > >> While thinking about this over the last few days and also getting some > >> advice from Patrick, I realized that we don't need to be this disruptive > >> by simply adding the 'repository' variable to the already existing > >> 'packed_git' struct. This allows us to leverage this information more > >> easily, since most of the functions already have access to the > >> 'packed_git' struct. > >> > >> This, plus the series by Jeff 'jk/dumb-http-finalize' which also removes > >> some existing functions. We reduce the impact to only 3 functions being > >> modified. > > > > Yeah, I noticed while working on that topic that we were dropping some > > uses of the_repository. And FWIW I had the same notion, that packed_git > > should perhaps refer to the repository struct in which it resides. > > > > As Taylor noted this is a tiny bit weird with respect to alternates, > > which could exist in another repository (but don't have to! It could be > > a bare objects/ directory). But I think from the perspective of a > > particular process, we only have one repository struct that covers all > > of its alternates for the duration of this process. So it would be OK in > > practice. You might be able to get away with just storing a hash_algo > > pointer in packed_git, which would be less weird (and is enough for the > > bits I looked at, but perhaps not in the more general case). > > This was my thought as well regarding alternates. Also it should be > noted that currently we're using the_repository anyways, so we will be > in the same state as before. Makes sense. Thanks, both, for thinking through it together. > In a general case it seems more necessary to add the repo and not just > the hash_algo. Mostly because there are parts which require access to > the repository and also because some of my patches add config changes > which also require access to the repository. I could believe that ;-). I see that you posted some patches lower down in the thread, which I figure probably uncover some cases where we need more than just a pointer to the_hash_algo. But let's read on and see exactly what shakes out. > > Looking at odb_pack_name(), it will still need to take a repository > > struct, since we sometimes form it before having a packed_git. But for > > most calls, I suspect you could have an alternate function that takes a > > packed_git and uses both its "hash" member and the algorithm. > > I have four functions which would still need to take in a repository: > 1. for_each_packed_object > 2. has_object_pack > 3. has_object_kept_pack > 4. obd_pack_name That matches my own thinking, but perhaps there are others that neither of us are coming up with off the tops of our heads. I think that as long as for_each_packed_object() continues to call prepare_packed_git (which sets up all of our alternates) and we continue to consult the packed_git_mru cache, we should be OK. > > Anyway, just my two cents having worked in the area recently. > > > > -Peff > > Thanks for your input! Indeed. Thanks, both. Thanks, Taylor