On 2/25/25 1:14 PM, Junio C Hamano wrote:
"Scott Chacon via GitGitGadget" <gitgitgadget@xxxxxxxxx> writes:
Furthermore, when I bundled just a tag (thinking it would have most
reachable objects) it completely failed to work because there were no
refs/heads/ available for negotiation - so it downloaded a huge file and
then still started from scratch on the fetch.
Nice finding.
This is an interesting case that I had not considered in the original
implementation.
The intention of the design is to avoid having the bundle URI fetch
changing tag refs, especially annotated tags. Those tag updates are
expected to be advertised in the "git fetch" output. It would probably
be best to peel the tag refs to a commit and then create a fake branch
for the bundle.
Now I'm only getting an extra 43k objects, so 1% of the original total,
and the entire operation is a bit faster as well.
That makes all sense.
I'm not sure if there is a downside here, it seems clearly how you would
want the negotiation to go. It ends up with way more refs under
refs/bundle (now there is refs/bundle/origin/master, etc) but that's
being polluted by the head refs anyhow, right?
I am not sure what you mean by "being polluted by the head refs
anyhow", but we should be equipped to deal with a repository with
tons of local branches, so having the comparable number of
remote-tracking branches instead of local branches now exposed in
refs/bundle/remotes/* hierarchy should work equally as well, or we
would have something we need to fix. So in principle I do not see
a problem with this approach.
The mapping used to be "refs/heads/foo" to "refs/bundles/foo", but
now "refs/heads/foo" is mapped to "refs/bundles/heads/foo", so that
you would presumably be mapping "refs/remotes/origin/master" to
"refs/bundles/remotes/origin/master", right? I hope existing users
are *not* looking at their resulting refs/bundles/ hierarchy and
somehow assuming the original mapping.
This is not something this "fix" changes, but unbundle_all_bundles()
apparently is prepared to handle more than one bundles. I wonder
what happens when multiple bundles expose the same branch pointing
at different objects? The way I read unbundle_from_file() is that
a new one overwrites the previous value, so even though we may have
all unbundled many bundles, the objects from earlier bundles may
lose their anchors, subject to be garbage-collected.
Imagine creating a bundle with two refs, refs/heads and
refs/remotes, and append that bundle as the last bundle of a bunch
of bundles full of local and remote-tracking branches, that have
populated refs/bundles/ hierarchy with tons of refs. Now the last
bundle is unbundled, and these two phoney refs would nuke everything
that used to be under refs/bundles/heads/* and refs/bundles/remotes/*
left by unpacking previous bundles, right?
Is this a reasonable change?
This is mostly Stolee's design, IIRC, so I have CC'ed; the work is
mostly from 53a50892 (bundle-uri: create basic file-copy logic,
2022-08-09) that is more than 2 years ago.
The biggest question I had (and tried to get ahead of on the PR) is
the use of a test to demonstrate what kind of bundle files cause this
issue. It would be important to demosntrate that the repo is still
usable if "refs/bundles/tags/v1.0" exists and points to a tag object.
- if (!skip_prefix(refname->string, "refs/heads/", &branch_name))
+ if (!skip_prefix(refname->string, "refs/", &branch_name))
continue;
So I'm OK with relaxing this to be more flexible, but I'm not sure
why the bundles couldn't be created using "refs/heads/", possibly via
changing the ref names during bundle creation.
By skipping only "refs/", "branch_name" is no longer a branch name,
which may be something we may want to fix, but if we were to address
the "names from later bundles seem to overwrite names used by
previous boundles, and unanchor objects obtained from them" problem,
I suspect we'd rather want to create new and unique names there,
without assuming or relying on that the names used by these bundles
are reasonably unique, so this part of the code may have to change
anyway, so we may not care too deeply at this point.
Thanks,
-Stolee