Re: [PATCH] negotiator/skipping: skip commits during fetch

On Mon, Jul 16 2018, Jonathan Tan wrote:

Didn't catch this until this was in next, sorry.

Re-arranged the diff a bit:

> -void fetch_negotiator_init(struct fetch_negotiator *negotiator)
> +void fetch_negotiator_init(struct fetch_negotiator *negotiator,
> +			   const char *algorithm)
>  {
> +	if (algorithm && !strcmp(algorithm, "skipping")) {
> +		skipping_negotiator_init(negotiator);
> +		return;
> +	}
>  	default_negotiator_init(negotiator);
>  }

OK, I understand that's how it works now, but...

> +fetch.negotiationAlgorithm::
> +	Control how information about the commits in the local repository is
> +	sent when negotiating the contents of the packfile to be sent by the
> +	server. Set to "skipping" to use an algorithm that skips commits in an
> +	effort to converge faster, but may result in a larger-than-necessary
> +	packfile; any other value instructs Git to use the default algorithm
> +	that never skips commits (unless the server has acknowledged it or one
> +	of its descendants).
> +

...let's instead document that there's just the values "skipping" and
"default", and say "default" is provided by default, and perhaps change
the code to warn about anything that isn't those two.

Then we're not painting ourselves into a corner by needing to break a
promise in the docs ("any other value instructs Git to use the default")
if we add a new one of these, and aren't silently falling back on the
default if we add new-fancy-algo the user's version doesn't know about.
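
A minimal sketch of what such a check could look like, assuming a
standalone helper rather than the actual fetch-negotiator.c code. The
enum, the function name parse_negotiation_algorithm() and the warning
text are hypothetical, made up for illustration; a real patch would
presumably use Git's own warning() helper instead of fprintf():

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical names, not taken from the patch. */
enum negotiation_algorithm {
	NEGOTIATE_DEFAULT,
	NEGOTIATE_SKIPPING
};

/*
 * Map the configured value to an algorithm. Returns 0 when the value
 * is unset or recognized. Returns -1 when it is unknown; in that case
 * *out is set to the default and a warning is printed to stderr, so
 * the fallback is no longer silent.
 */
static int parse_negotiation_algorithm(const char *value,
				       enum negotiation_algorithm *out)
{
	*out = NEGOTIATE_DEFAULT;
	if (!value || !strcmp(value, "default"))
		return 0;
	if (!strcmp(value, "skipping")) {
		*out = NEGOTIATE_SKIPPING;
		return 0;
	}
	fprintf(stderr, "warning: unknown fetch.negotiationAlgorithm '%s'; "
		"using 'default'\n", value);
	return -1;
}
```

With something along these lines, fetch_negotiator_init() could switch
on the parsed value, and a config written for a newer Git would produce
a visible warning on an older one.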

Also, switching gears entirely, I'm very excited about this whole thing
because it allows me to address something I've been meaning to get to
for a while.

At work I sometimes want to see what commits I've made to all our git
repos, for remembering what I was doing last February or whatever (this
is for filling in quarterly reports).

So I have this script that basically does this:

    for repo in $(get-list-of-all-the-things)
    do
        git config "remote.$repo.url" git@xxxxxxxxxxxxxxxxxxxxxx:$repo.git
        git config "remote.$repo.fetch" "+HEAD:$repo/HEAD"
        git config "remote.$repo.tagOpt" "--no-tags"
    done &&
    git fetch --all

I.e. for every repo like git/git I'll fetch its upstream HEAD as the
branch git/git/HEAD. Then I can do stuff like:

    git shortlog --author=Ævar --since=2018-02-01 --until=2018-03-01

Now, running that "git fetch --all" takes ages, and I know why. It's
because in the negotiation for "git fetch some/small-repo" I'm
emitting hundreds of thousands of "have" lines for SHA1s found in other
unrelated repos, only to get a NAK for all of them.

One way to fix that with this facility would be to have some way to pass
in arguments, similar to what we have for merge drivers, so I can say
"just emit 'have' lines for stuff found in this branch". The most
pathological cases are when I'm fetching a remote that has one commit,
and I'm desperately trying to find something in common by asking if the
remote has hundreds of K of commits it has no chance of having.
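
To make the "just emit 'have' lines for stuff found in this branch"
idea concrete, here is a rough sketch of the filtering step only. It
is not how the negotiator is wired up today: select_have_tips() is a
hypothetical function, and the assumption that the per-remote branches
live under a common ref prefix such as "refs/heads/git/git/" is mine:

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical filter: copy into out[] the ref names that start with
 * prefix and return how many were kept. Only the tips of those refs
 * would then be used to seed the "have" lines; everything else in the
 * local repository is ignored during negotiation.
 * out[] must have room for nr entries.
 */
static size_t select_have_tips(const char **refs, size_t nr,
			       const char *prefix, const char **out)
{
	size_t i, kept = 0;
	size_t len = strlen(prefix);

	for (i = 0; i < nr; i++)
		if (!strncmp(refs[i], prefix, len))
			out[kept++] = refs[i];
	return kept;
}
```

For the one-commit remote case this would cut the negotiation down to
the history reachable from a single local branch.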

Or there may be some smarter way to do this, what do you think?


