Re: git-fetch pulls already-pulled objects?

> What you are expecting _could_ be implemented by exchanging all
> tree and blob objects sending and receiving sides have and computing
> the set difference, but the sender and the receiver do not exchange
> such a huge list.

In my case, I only want to exchange the hash of the tree object that each
commit points to directly; I don't care about the subtrees and blobs
reachable from the commit. I think a naive approach would, in the worst
case, only double the number of hashes sent.
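
To make the "double the number of hashes" estimate concrete, here is a
minimal sketch of the idea in Python. It is a toy model, not the real Git
protocol: the functions advertise() and objects_to_skip() and the ids
c1/t1/... are made up for illustration.

```python
# Hypothetical extension for illustration only; NOT part of the Git protocol.

def advertise(receiver_commits):
    """receiver_commits maps commit id -> root tree id (made-up ids).

    Returns the list of hashes the receiver would send: each commit
    plus its root tree, with duplicates sent only once.
    """
    hashes, seen = [], set()
    for commit, tree in receiver_commits.items():
        for h in (commit, tree):  # the tree is the one extra hash per commit
            if h not in seen:
                seen.add(h)
                hashes.append(h)
    return hashes


def objects_to_skip(advertised, sender_objects):
    """Advertised hashes that the sender also has need not be sent again."""
    return set(advertised) & set(sender_objects)


# Two commits (c1, c2) record the same root tree t1.
receiver = {"c1": "t1", "c2": "t1", "c3": "t2"}
sent = advertise(receiver)
print(len(sent), "hashes for", len(receiver), "commits")
```

The list is at most twice as long as a commit-only list, and shorter
whenever several commits share a root tree.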

Would negotiating the tree object hashes be possible on the client without
server changes? Is the protocol that flexible?


If what I want is *not* possible, is it possible to explicitly put a tree
(and its descendants) into its own pack? I know ahead of time which trees
and commits will be sent, so I think doing this on the server would speed
up git-fetch a bit: the server would do less work pulling the objects out
of an existing pack and repacking them for the client. (Or maybe my mental
model of git packs is wrong?)

> The object transfer is done by first finding the common ancestor of
> histories of the sending and the receiving sides, which allows the
> sender to enumerate commits that the sender has but the receiver
> doesn't.  From there, all objects [*1*] that are referenced by these
> commits need to be sent.

Thanks for clarifying.

> *1* There is an optimization to exclude the trees and blobs that can
> be cheaply proven to exist on the receiving end.

That makes sense (especially for 'git revert HEAD' situations).
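
For the archive, here is how I now picture the transfer, as a toy Python
model. Everything in it is an assumption for illustration (the ids, the
dict layout, the function names); in particular it treats the objects of
*every* common commit as "cheaply proven", which is more generous than
whatever heuristic Git really uses.

```python
# Toy model of the object transfer described above; not Git's actual code.
# commits: id -> (parent ids, root tree id); trees: id -> child object ids.

def ancestors(commits, tips):
    """Every commit known to `commits` that is reachable from `tips`."""
    seen, stack = set(), list(tips)
    while stack:
        c = stack.pop()
        if c in commits and c not in seen:  # ignore commits the sender lacks
            seen.add(c)
            stack.extend(commits[c][0])
    return seen


def reachable(trees, root):
    """A tree plus every subtree and blob below it."""
    seen, stack = set(), [root]
    while stack:
        o = stack.pop()
        if o not in seen:
            seen.add(o)
            stack.extend(trees.get(o, ()))
    return seen


def objects_to_send(commits, trees, sender_tips, receiver_tips):
    common = ancestors(commits, receiver_tips)
    missing = ancestors(commits, sender_tips) - common
    # Simplified stand-in for the footnote's optimization.
    known = set()
    for c in common:
        known |= reachable(trees, commits[c][1])
    needed = set(missing)
    for c in missing:
        needed |= reachable(trees, commits[c][1])
    return needed - known


# C reverts B, so C records the same tree as A.
commits = {"A": ((), "T1"), "B": (("A",), "T2"), "C": (("B",), "T1")}
trees = {"T1": ("f1",), "T2": ("f2",)}

revert_case = objects_to_send(commits, trees, ["C"], ["B"])
# The receiver's only commit X is unknown to the sender, so nothing is proven.
unknown_case = objects_to_send(commits, trees, ["C"], ["X"])
print(sorted(revert_case), sorted(unknown_case))
```

In the revert case only the commit C itself is sent. In the second case
the receiver may well have T1 via X, but the sender cannot tell, so
everything goes over the wire.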

Thank you for your reply, Junio.

-----Original Message-----
From: Junio C Hamano <gitster@xxxxxxxxx>
Date: Thursday, October 29, 2015 at 10:32 AM
To: Matt Glazar <strager@xxxxxx>
Cc: "git@xxxxxxxxxxxxxxx" <git@xxxxxxxxxxxxxxx>
Subject: Re: git-fetch pulls already-pulled objects?

>Matt Glazar <strager@xxxxxx> writes:
>
>> On a remote, I have two Git commit objects which point to the same tree
>> object (created with git commit-tree).
>
>What you are expecting _could_ be implemented by exchanging all
>tree and blob objects sending and receiving sides have and computing
>the set difference, but the sender and the receiver do not exchange
>such a huge list.
>
>The object transfer is done by first finding the common ancestor of
>histories of the sending and the receiving sides, which allows the
>sender to enumerate commits that the sender has but the receiver
>doesn't.  From there, all objects [*1*] that are referenced by these
>commits need to be sent.
>
>
>[Footnote]
>
>*1* There is an optimization to exclude the trees and blobs that can
>be cheaply proven to exist on the receiving end.  If the receiving
>end has a commit that the sending end does *not* have, and that
>commit happens to record a tree the sending end needs to send,
>however, the sending end cannot prove that the tree does not have to
>be sent without first fetching that commit from the receiving end,
>which fails "can be cheaply proven to exist" test.
>

