On Mon, Sep 07, 2015 at 09:05:41AM +0800, Levin Du wrote:

> > Instead, the object transfer is optimized by comparing what commits
> > each side has and sending trees and blobs that are reachable from
> > the commits that the receiving side does not have.
>
> The sender A sends all the commits that the receiver B does not have.
> The commits contains trees and blobs. In my situation, branch in A has
> only one commit. It seems that B has received lots of duplicate blobs,
> concluded from the GC result.

Right. B tells A "I already have this commit", but A does not already
have it, so that information is not helpful. A cannot make any
assumptions about what B has, and must send all trees and blobs
referenced by its commit.

> What I do not understand is, how duplicate blobs happen in a git repository?
> Git repository is famous for its content addressing storage system.
> I guess that A sends its packed file to B directly, no matter what are
> already in B.

Not exactly. During a push, git may or may not keep the packfile sent
over the wire, depending on the number of objects in it and the
receive.unpackLimit config setting. The same object can exist in two
separate packfiles. One of the effects of "git gc" is to remove such
duplicates.

So A effectively does send its whole pack in this case, but only because
it cannot find any shared history with B (and B keeps it as-is until the
next gc because it is over the unpackLimit).

-Peff
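
To make the two effects above concrete, here is a rough toy model in
Python. It is only a sketch and not how git is implemented: the function
names, the dict/set data layout, and the rule that unpacking skips
objects already stored are all assumptions made for illustration. The
one value taken from git is the documented default unpack limit of 100.
The sender can only use an advertised commit if it has that commit
itself, and the receiver keeps the incoming pack as-is once it reaches
the limit.

```python
# Toy model of the push behaviour described above. NOT git's real code:
# all names and data structures here are hypothetical.

UNPACK_LIMIT = 100  # git's documented default for the unpack limit


def reachable(commits, tips):
    """Commits reachable from `tips`, ignoring ids the repo does not have."""
    seen, stack = set(), [t for t in tips if t in commits]
    while stack:
        c = stack.pop()
        if c in seen:
            continue
        seen.add(c)
        parents = commits[c][0]
        stack.extend(p for p in parents if p in commits)
    return seen


def objects_to_send(commits, tip, advertised):
    """Objects the sender transmits for `tip`.

    `commits` maps commit id -> (set of parent ids, set of tree/blob ids).
    `advertised` is what the receiver says it has. Advertised commits the
    sender does not have are useless to it and are silently dropped.
    """
    common = reachable(commits, advertised)
    assumed = set().union(*(commits[c][1] for c in common))
    new_commits = reachable(commits, [tip]) - common
    out = set(new_commits)
    for c in new_commits:
        out |= commits[c][1]
    return out - assumed


def receive(packs, loose, incoming, unpack_limit=UNPACK_LIMIT):
    """Store `incoming`: keep it as a pack at or above the limit, else unpack."""
    if len(incoming) >= unpack_limit:
        packs.append(set(incoming))  # kept as-is, duplicates included
    else:
        stored = set().union(loose, *packs)
        loose.update(incoming - stored)  # model assumption: skip known objects


def duplicates(packs):
    """Objects that appear in more than one pack."""
    seen, dup = set(), set()
    for p in packs:
        dup |= seen & p
        seen |= p
    return dup


if __name__ == "__main__":
    blobs = {f"blob{i}" for i in range(150)}
    # A: a single root commit, no history shared with B.
    a = {"a1": (set(), blobs)}
    # B: already has 120 of the same blobs, under its own commit "b1".
    b_packs, b_loose = [{"b1"} | {f"blob{i}" for i in range(120)}], set()

    sent = objects_to_send(a, "a1", advertised=["b1"])
    receive(b_packs, b_loose, sent)
    print("objects sent:", len(sent))
    print("duplicated across packs:", len(duplicates(b_packs)))
```

If A's commit had "b1" as a parent instead, the model would send only
the new commit plus the 30 blobs B lacks, and since that is under the
limit B would unpack them rather than keep a second pack.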