Re: git clone of empty repositories doesn't preserve hash

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Apr 05, 2023 at 01:40:22PM -0700, Junio C Hamano wrote:

> Jeff King <peff@xxxxxxxx> writes:
> 
> > Yeah, we send a special capability line in that case. If you do:
> >
> >   GIT_TRACE_PACKET=1 git clone a b
> >
> > you can see that upload-pack indicates that ls-refs understands the
> > "unborn" capability:
> >
> >   packet:  upload-pack> version 2
> >   packet:  upload-pack> agent=git/2.40.0.824.g7b678b1f643
> >   packet:  upload-pack> ls-refs=unborn
> >   packet:  upload-pack> fetch=shallow wait-for-done
> >   packet:  upload-pack> server-option
> >   packet:  upload-pack> object-format=sha256
> >   packet:  upload-pack> object-info
> >   packet:  upload-pack> 0000
> >
> > And then clone asks for it say "yes, I also understand unborn":
> >
> >   packet:        clone> command=ls-refs
> >   packet:        clone> agent=git/2.40.0.824.g7b678b1f643
> >   packet:        clone> object-format=sha256
> >   packet:        clone> 0001
> >   packet:        clone> peel
> >   packet:        clone> symrefs
> >   packet:        clone> unborn
> >   packet:        clone> ref-prefix HEAD
> >   packet:        clone> ref-prefix refs/heads/
> >   packet:        clone> ref-prefix refs/tags/
> >   packet:        clone> 0000
> >
> > And then upload-pack can send us the extra information:
> >
> >   packet:  upload-pack> unborn HEAD symref-target:refs/heads/maestro
> >   packet:  upload-pack> 0000
> >
> > I think we'd need to do something similar here for object-format.
> 
> I guess we only need to touch "git clone" then.  Without being
> asked, it advertsizes object-format=sha256 already, and when the
> maestro repository is prepared without --object-format=sha256,
> upload-pack advertises object-format=sha1 instead.  So it probably
> is just the matter of capturing it and using it to populate the
> extensions.objectformat with an appropriate value.

Ah, yeah, you're right. I was thinking that capability advertisement was
"by the way, I understand sha256". But I may just be showing my
ignorance of the current state of the hash-transition protocol
extensions. :)

I'm actually surprised this does not Just Work already. The client gets
the intended algorithm from that capability line, which is how we know
how to parse any ls-refs output at all (we are not just guessing from
the length). But we only bother to set it in the local config if we have
refs to fetch, which seems like a bug.

So the solution is maybe something like this:

diff --git a/builtin/clone.c b/builtin/clone.c
index 462c286274c..5eca95cb892 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -1296,19 +1296,22 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
 			clear_bundle_list(transport->bundles);
 			FREE_AND_NULL(transport->bundles);
 		}
 	}
 
-	if (mapped_refs) {
+	{
 		int hash_algo = hash_algo_by_ptr(transport_get_hash_algo(transport));
 
 		/*
 		 * Now that we know what algorithm the remote side is using,
 		 * let's set ours to the same thing.
 		 */
 		initialize_repository_version(hash_algo, 1);
 		repo_set_hash_algo(the_repository, hash_algo);
+	}
+
+	if (mapped_refs) {
 		/*
 		 * transport_get_remote_refs() may return refs with null sha-1
 		 * in mapped_refs (see struct transport->get_refs_list
 		 * comment). In that case we need fetch it early because
 		 * remote_head code below relies on it.



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux