Re: Git 2.26 fetches many times more objects than it should, wasting gigabytes

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Jeff King wrote:

> So it really does seem like something in v2 is not trying as hard to
> negotiate as v0 did, even when using stateless-http.
>
> I'm attaching for-each-ref output before and after the xo fetch. That
> should be sufficient to recreate the situation synthetically even once
> these repos have moved on.

Thanks again.  Looked a little closer and the advertised refs shouldn't
matter.  We just have two completely different code paths for
negotiation, one for v0 and one for v2.  In v0, we do

	fetch_negotiator_init(r, negotiator);
	mark_complete_and_common_ref(negotiator, args, &ref);
	filter_refs(argrs, &ref, sought, nr_sought);
	find_common(negotiator, args, &ref, sought, nr_sought);

find_common does

	count = 0;
	while ((oid = negotiator->next(negotiator)) {
		write "have %s\n", oid
		if (++count >= flush_at) {
			... flush ...
			flush_at = next_flush(args->stateless_rpc, count);
			...

In v2, the corresponding code is in add_haves:

	while ((oid = negotiator->next(negotiator))) {
		write "have %s\n", oid
		if (++haves_added >= *haves_to_send)
			break;
	}
	*in_vain += haves_added;
	if (!haves_added || *in_vain >= MAX_IN_VAIN)
		write "done"
	*haves_to_send = next_flush(1, *haves_to_send);

In other words, in v2 we reset the counter on each round whereas in v0
we keep a running total.  That would be expected to produce larger
negotiate requests in v2 than v0 (which looks like an unintentional
difference, but not the one producing this bug).

So much for flush_at handling.  in_vain seems like another area to
compare: in v2, the driving logic is in do_fetch_pack_v2:

	haves_to_send = INITIAL_FLUSH;
 negotiate_loop:
	while (!send_fetch_request(negotiator, fd, args, ref,
				   &common, &haves_to_send,
				   &in_vain, reader.use_sideband))
		switch (process_acks(negotiator, &reader, &common)) {
		case 1:
			in_vain = 0; /* renewed hope!
			continue;
		case 2:
			break negotiate_loop; /* time to move on */
		default:
			continue;
		}

When process_acks sees an ACK, it passes it on to the negotiator.
It wants to record that it received an ack to reset in_vain, but
it forgets to!  The function is initialized and read but never
written to.

So I'd expect the following to help:

diff --git i/fetch-pack.c w/fetch-pack.c
index 1734a573b01..a1d743e1f61 100644
--- i/fetch-pack.c
+++ w/fetch-pack.c
@@ -1287,6 +1287,8 @@ static int process_acks(struct fetch_negotiator *negotiator,
 			struct object_id oid;
 			if (!get_oid_hex(arg, &oid)) {
 				struct commit *commit;
+
+				received_ack = 1;
 				oidset_insert(common, &oid);
 				commit = lookup_commit(the_repository, &oid);
 				if (negotiator)



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux