On 3/4/2022 6:43 PM, Junio C Hamano wrote:
> "Derrick Stolee via GitGitGadget" <gitgitgadget@xxxxxxxxx> writes:
>
>> From: Derrick Stolee <derrickstolee@xxxxxxxxxx>
>>
>> In order to have a valid pack-file after unbundling a bundle that has
>> the 'filter' capability, we need to generate a .promisor file. The
>> bundle does not promise _where_ the objects can be found, but we can
>> expect that these bundles will be unbundled in repositories with
>> appropriate promisor remotes that can find those missing objects.
>
> That sounds like a lot of wishful thinking, but I do not think of a
> better way to phrase the idea.  Taking a bundle out of a repository
> and unbundling it elsewhere is "git fetch" that could be done to
> send objects from the former to the latter repository, so I am OK
> with the assumption that the original repository will stay available
> for such users who took its contents over sneaker-net instead of
> over the wire.

As an aside, I'm also concerned about the existing model of promisor
remotes where it depends on each remote, and isn't a repository-wide
state.

In particular, if I do a blobless partial clone of git/git and then
add git-for-windows/git as a remote and fetch it, it will break because
git-for-windows/git isn't set up as a promisor remote and we expect to
have every blob reachable from its pack-file (even though it was not
sent because we advertised a commit that can reach it).

I've been thinking about adjusting the config parsing around promisors
to say "I see one promisor remote, so I will assume all remotes are
promisors." It seems to me that this will fix cases like the above
without further breaking any cases (that are not already broken).

But that's a tangent for another time. :)

>> Use the 'git index-pack --promisor=<message>' option to create this
>> .promisor file. Add "from-bundle" as the message to help anyone diagnose
>> issues with these promisor packs.
>>
>> Signed-off-by: Derrick Stolee <derrickstolee@xxxxxxxxxx>
>> ---
>>  bundle.c               | 4 ++++
>>  t/t6020-bundle-misc.sh | 8 +++++++-
>>  2 files changed, 11 insertions(+), 1 deletion(-)
>>
>> diff --git a/bundle.c b/bundle.c
>> index e284ef63062..3d97de40ef0 100644
>> --- a/bundle.c
>> +++ b/bundle.c
>> @@ -631,6 +631,10 @@ int unbundle(struct repository *r, struct bundle_header *header,
>>  	struct child_process ip = CHILD_PROCESS_INIT;
>>  	strvec_pushl(&ip.args, "index-pack", "--fix-thin", "--stdin", NULL);
>>
>> +	/* If there is a filter, then we need to create the promisor pack. */
>> +	if (header->filter)
>> +		strvec_push(&ip.args, "--promisor=from-bundle");
>> +
>>  	if (extra_index_pack_args) {
>>  		strvec_pushv(&ip.args, extra_index_pack_args->v);
>>  		strvec_clear(extra_index_pack_args);
>> diff --git a/t/t6020-bundle-misc.sh b/t/t6020-bundle-misc.sh
>> index 39cfefafb65..344af34db1e 100755
>> --- a/t/t6020-bundle-misc.sh
>> +++ b/t/t6020-bundle-misc.sh
>> @@ -513,7 +513,13 @@ do
>>  		The bundle uses this filter: $filter
>>  		The bundle records a complete history.
>>  		EOF
>> -		test_cmp expect actual
>> +		test_cmp expect actual &&
>> +
>> +		# This creates the first pack-file in the
>> +		# .git/objects/pack directory. Look for a .promisor.
>> +		git bundle unbundle partial.bdl &&
>> +		ls .git/objects/pack/pack-*.promisor >promisor &&
>> +		test_line_count = 1 promisor
>
> OK.  Do we also want to inspect the contents of the resulting
> repository to make sure that the bundle had the right contents?
>
> One idea to do so would probably be
>
>  - prepare a test repository (you already have it)
>  - prepare a partial.bdl (you already do this)
>
>  - clone the test repository into a new repository, with the same
>    filter
>  - create an empty repository, unbundle the partial.bdl
>
>  - take "for-each-ref" and list of objects available in these two
>    "partial copies" from the test repository, and compare

Good idea. Thanks!

Of course, looking closer at it...
"git bundle unbundle" doesn't actually store the refs directly in the
refspace, but instead only outputs the refs that it used. Here is an
attempt to verify the refs that are reported match those in a mirror
clone.

--- >8 ---

diff --git a/t/t6020-bundle-misc.sh b/t/t6020-bundle-misc.sh
index 344af34db1e..a228cbfc4e3 100755
--- a/t/t6020-bundle-misc.sh
+++ b/t/t6020-bundle-misc.sh
@@ -490,7 +490,7 @@ test_expect_success 'unfiltered bundle with --objects' '
 for filter in "blob:none" "tree:0" "tree:1" "blob:limit=100"
 do
 	test_expect_success 'filtered bundle: $filter' '
-		test_when_finished rm -rf .git/objects/pack &&
+		test_when_finished rm -rf .git/objects/pack cloned unbundled &&
 		git bundle create partial.bdl \
 			--all \
 			--filter=$filter &&
@@ -515,11 +515,22 @@ do
 		EOF
 		test_cmp expect actual &&
 
-		# This creates the first pack-file in the
-		# .git/objects/pack directory. Look for a .promisor.
-		git bundle unbundle partial.bdl &&
-		ls .git/objects/pack/pack-*.promisor >promisor &&
-		test_line_count = 1 promisor
+		git init unbundled &&
+		(
+			cd unbundled &&
+			# This creates the first pack-file in the
+			# .git/objects/pack directory. Look for a .promisor.
+			git bundle unbundle ../partial.bdl >ref-list.txt &&
+			ls .git/objects/pack/pack-*.promisor >promisor &&
+			test_line_count = 1 promisor
+		) &&
+
+		git clone --filter=blob:none --mirror "file://$(pwd)" cloned &&
+		git -C cloned for-each-ref \
+			--format="%(objectname) %(refname)" >cloned-refs.txt &&
+		echo "$(git -C cloned rev-parse HEAD) HEAD" >>cloned-refs.txt &&
+		test_cmp cloned-refs.txt unbundled/ref-list.txt
 	'
 done

--- >8 ---

I also attempted doing a "git clone --bare partial.bdl unbundled.git"
to get the 'git clone' command to actually place the refs. However,
'git clone' does not set up the repository filter based on the bundle,
so it reports missing blobs (even though there is no checkout).
Making this work would require that "repository global promisor
config" idea that I mentioned in another reply.
I'll make note of this as a potential application of that idea.

Thanks,
-Stolee
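
P.S. For anyone who wants to poke at the index-pack step outside the
test suite, here is a rough standalone sketch. It does not create a
bundle at all; it only feeds a plain pack to 'git index-pack' with the
same arguments that unbundle() adds when the header has a filter, and
then counts the .promisor files. The temporary directory layout, the
file name, and the committer identity are made up for illustration.

```shell
#!/bin/sh
# Sketch only: paths, file contents, and identity below are hypothetical.
set -e

work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

# A throwaway source repository with a single commit.
git init -q "$work/src"
echo hello >"$work/src/file.txt"
git -C "$work/src" add file.txt
git -C "$work/src" -c user.name=Example -c user.email=example@example.com \
	commit -q -m "initial"

# Stream every reachable object as one pack, standing in for the
# pack payload that a bundle would carry.
git -C "$work/src" pack-objects --all --stdout -q </dev/null >"$work/all.pack"

# Index that pack in an empty repository, passing the arguments the
# patch makes unbundle() use for filtered bundles.
git init -q "$work/dst"
git -C "$work/dst" index-pack --fix-thin --stdin --promisor=from-bundle \
	<"$work/all.pack" >/dev/null

# Count the .promisor files that now sit next to the new pack.
promisor_count=$(ls "$work/dst/.git/objects/pack/"pack-*.promisor | wc -l | tr -d ' ')
echo "promisor files: $promisor_count"
```

I would expect this to report a single .promisor file, matching what
the test_line_count check in the patch looks for.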