Re: [PATCH] fix simple deepening of a repo

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Nicolas Pitre <nico@xxxxxxx> writes:

> If all refs sent by the remote repo during a fetch are reachable 
> locally, then no further conversation is performed with the remote. This 
> check is skipped when the --depth argument is provided to allow the 
> deepening of a shallow clone which corresponding remote repo has no 
> changed.
>
> However, some additional filtering was added in commit c29727d5 to 
> remove those refs which are equal on both sides.  If the remote repo has 
> not changed, then the list of refs to give the remote process becomes 
> empty and simply attempting to deepen a shallow repo always fails.
>
> Let's stop being smart in that case and simply send the whole list over
> when that condition is met.  The remote will do the right thing anyways.
>
> Test cases for this issue are also provided.
>
> Signed-off-by: Nicolas Pitre <nico@xxxxxxx>
> ---

Thanks.  The fix looks correct (as usual with patches from you).

But it makes me wonder if this logic to filter refs is buying us anything.

>  	for (rm = refs; rm; rm = rm->next) {
> +		nr_refs++;
>  		if (rm->peer_ref &&
>  		    !hashcmp(rm->peer_ref->old_sha1, rm->old_sha1))
>  			continue;
		ALLOC_GROW(heads, nr_heads + 1, nr_alloc);
		heads[nr_heads++] = rm;
	}

What is the point of not asking for the refs that we know are the same?

In other words, what breaks (not necessarily in the correctness sense, but
also in the performance sense) if we get rid of this filtering altogether?

Over a native protocol, we will tell the other end what we have and the
remote end will notice that we are asking for the same thing, so it won't
include unnecessary objects in the pack anyway.  When calling out to a
walker, we will also notice that the ref matches what we have already and
will not fetch anything from the remote.  Because the commit at the tip of
the ref that is already up-to-date is marked as COMPLETE, we will not even
locally have to walk the object chain to make sure things are connected.

I think the only thing that this possibly gains is if _everything_ is up
to date.  Then we won't need to make a call to transport->fetch() and it
would save a new connection.

But that optimization is already done long ago by the caller's
quickfetch() in the normal case.

Which makes me suspect that the real fix should be something a lot
simpler, like this (untested) patch.

diff --git a/transport.c b/transport.c
index f231b35..14dab81 100644
--- a/transport.c
+++ b/transport.c
@@ -1053,18 +1053,16 @@ const struct ref *transport_get_remote_refs(struct transport *transport)
 int transport_fetch_refs(struct transport *transport, const struct ref *refs)
 {
 	int rc;
-	int nr_heads = 0, nr_alloc = 0;
+	int nr_heads = 0;
 	const struct ref **heads = NULL;
 	const struct ref *rm;
 
-	for (rm = refs; rm; rm = rm->next) {
-		if (rm->peer_ref &&
-		    !hashcmp(rm->peer_ref->old_sha1, rm->old_sha1))
-			continue;
-		ALLOC_GROW(heads, nr_heads + 1, nr_alloc);
+	for (rm = refs; rm; rm = rm->next)
+		nr_heads++;
+	heads = xmalloc(nr_heads * sizeof(*heads));
+	nr_heads = 0;
+	for (rm = refs; rm; rm = rm->next)
 		heads[nr_heads++] = rm;
-	}
-
 	rc = transport->fetch(transport, nr_heads, heads);
 	free(heads);
 	return rc;
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]