[PATCH] make 'git clone' ask the remote only for objects it cares about

The current behavior of 'git clone' when not using --mirror is to fetch 
everything from the peer, and then filter out unwanted refs just before 
writing them out to the cloned repository.  This may become highly 
inefficient if the peer has an unusual ref namespace, or if it simply 
has "remotes" refs of its own, and those locally unwanted refs are 
connected to a large set of objects which become unreferenced as soon 
as they are fetched.

Let's filter out those unwanted refs from the peer _before_ telling it 
what refs we want to fetch instead, which is the most logical thing to 
do anyway.

Signed-off-by: Nicolas Pitre <nico@xxxxxxxxxxx>
---

On Fri, 25 Sep 2009, Nicolas Pitre wrote:

> On Fri, 25 Sep 2009, Jason Merrill wrote:
> 
> > On 09/25/2009 04:47 PM, Nicolas Pitre wrote:
> > > Do you have access to the remote machine?  Is it possible to have a
> > > tarball of the gcc.git directory from there?
> > 
> > http://gcc.gnu.org/gcc-git.tar.gz
> > 
> > I'll leave it there for a few days.
> 
> Thanks, I got it now.  And I was able to reproduce the issue locally.
> 
> Cloning the original repository does transfer objects which become 
> unreferenced in the clone.  But cloning that cloned repository (before 
> pruning the unreferenced objects) does not transfer those objects again.  
> 
> Just need to find out why.

And the "why" is described above.  The problem was actually on the 
client side and was affecting clones of any repository containing 
anything outside refs/heads and refs/tags.
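
To illustrate the idea only, here is a minimal standalone sketch of 
"filter first, fetch second".  It is not the git code: the names 
demo_ref_is_wanted() and demo_filter_wanted() are hypothetical, plain 
string prefixes stand in for the real refspec matching done by 
get_fetch_map(), and the separate handling of HEAD is ignored.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return 1 if name begins with prefix, 0 otherwise. */
static int has_prefix(const char *name, const char *prefix)
{
	return strncmp(name, prefix, strlen(prefix)) == 0;
}

/*
 * Assumption for this sketch: a default (non-mirror) clone only wants
 * branches and tags.  Real git decides this through refspecs.
 */
static int demo_ref_is_wanted(const char *name)
{
	return has_prefix(name, "refs/heads/") ||
	       has_prefix(name, "refs/tags/");
}

/*
 * Copy the wanted names from advertised[] into wanted[], preserving
 * their order, and return how many were kept.  wanted[] must have room
 * for n entries.  Only the kept names would then be requested from the
 * peer, so objects reachable solely from the other refs are never
 * transferred.
 */
static size_t demo_filter_wanted(const char *const *advertised, size_t n,
				 const char **wanted)
{
	size_t i, kept = 0;

	assert(advertised != NULL && wanted != NULL);
	for (i = 0; i < n; i++)
		if (demo_ref_is_wanted(advertised[i]))
			wanted[kept++] = advertised[i];
	return kept;
}
```

Given a peer advertising refs/heads/master, refs/remotes/origin/old and 
refs/tags/v1.0, only the first and the last would be kept and asked for.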

The fact that the git repository on gcc.gnu.org has lots of stuff in 
"remote" branches that don't get cloned by default is a separate 
configuration/policy issue on that server which might (or might not) 
need to be looked into.  At the very least, as a bare repository, it 
should have all the git files in gcc.git/ directly instead of 
gcc.git/.git/.

diff --git a/builtin-clone.c b/builtin-clone.c
index bab2d84..edf7c7f 100644
--- a/builtin-clone.c
+++ b/builtin-clone.c
@@ -329,24 +329,28 @@ static void remove_junk_on_signal(int signo)
 	raise(signo);
 }
 
-static struct ref *write_remote_refs(const struct ref *refs,
-		struct refspec *refspec, const char *reflog)
+static struct ref *wanted_peer_refs(const struct ref *refs,
+		struct refspec *refspec)
 {
 	struct ref *local_refs = NULL;
 	struct ref **tail = &local_refs;
-	struct ref *r;
 
 	get_fetch_map(refs, refspec, &tail, 0);
 	if (!option_mirror)
 		get_fetch_map(refs, tag_refspec, &tail, 0);
 
+	return local_refs;
+}
+
+static void write_remote_refs(const struct ref *local_refs, const char *reflog)
+{
+	const struct ref *r;
+
 	for (r = local_refs; r; r = r->next)
 		add_extra_ref(r->peer_ref->name, r->old_sha1, 0);
 
 	pack_refs(PACK_REFS_ALL);
 	clear_extra_refs();
-
-	return local_refs;
 }
 
 int cmd_clone(int argc, const char **argv, const char *prefix)
@@ -495,9 +499,10 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
 
 	strbuf_reset(&value);
 
-	if (path && !is_bundle)
+	if (path && !is_bundle) {
 		refs = clone_local(path, git_dir);
-	else {
+		mapped_refs = wanted_peer_refs(refs, refspec);
+	} else {
 		struct remote *remote = remote_get(argv[0]);
 		transport = transport_get(remote, remote->url[0]);
 
@@ -520,14 +525,16 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
 					     option_upload_pack);
 
 		refs = transport_get_remote_refs(transport);
-		if (refs)
-			transport_fetch_refs(transport, refs);
+		if (refs) {
+			mapped_refs = wanted_peer_refs(refs, refspec);
+			transport_fetch_refs(transport, mapped_refs);
+		}
 	}
 
 	if (refs) {
 		clear_extra_refs();
 
-		mapped_refs = write_remote_refs(refs, refspec, reflog_msg.buf);
+		write_remote_refs(mapped_refs, reflog_msg.buf);
 
 		remote_head = find_ref_by_name(refs, "HEAD");
 		remote_head_points_at =
