Re: refs/replace advice

j.sixt@xxxxxxxxxxxxx wrote on Fri, 29 Jul 2011 17:49 +0200:
> On 7/29/2011 17:31, Pete Wyckoff wrote:
> > I'm trying to use "git replace" to avoid cloning the entire set
> > of duplicate commits across a slow inter-site link.  Like this:
> > 
> >     ...---A----B----C   site1/top
> >                      \
> >                       D---E---F  site1/proj
> > 
> >     ...---A'---B'---C'  site2/top
> > 
> > It is true that "git diff C C'" is empty:  they are identical.
> ...
> > I thought maybe I could "git fetch --depth=N" where N would cover
> > the range A'..site2/top, then replace.  But testing with "git
> > fetch --depth=3" still wants to fetch 100k objects.
> 
> On site2, don't you want to 'git fetch --depth=N site1' such that F down
> to at least C (but not much more) is fetched, and then apply the graft or
> replacement on site2?

Yes, that makes sense.  But even a shallow clone needs to pull the
entire tree.

On site1 (bare .git repo):

    $ du -sm .
    542     .
    $ git merge-base site1/proj site1/top
    ff016f956ccae7878a1b322ba950a0088c6e2ded  ;# this is A
    $ git rev-list ff016f956ccae7878a1b322ba950a0088c6e2ded | wc
        566     566   23206

On site2:

    $ du -sm .git
    649     .git
    $ git rev-parse :/1384557
    0f95d91c37bc870d610b7bd45b316ab219750d31  ;# this is A'
    $ git rev-list 0f95d91c37bc870d610b7bd45b316ab219750d31 | wc
        566     566   23206

Same number of commits all the way back to the beginning of time,
but the timestamp in the root commit is different, so all the SHA1s
are different.
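
To illustrate why one differing root timestamp changes every SHA1
downstream, here is a minimal sketch (not the real repos).  It builds
two throwaway repositories with identical file content and varies only
the date of the root commit.  The directory names, file name, dates and
the build_repo helper are all made up for the demonstration.

```shell
#!/bin/sh
# Sketch: identical content, different root commit date.
set -e
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

# Fixed identity so that only the dates can influence the commit SHA1s.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# build_repo DIR ROOT_DATE (hypothetical helper): three commits with the
# same content and messages; only the first commit's date is ROOT_DATE.
build_repo() {
    git init -q "$1"
    for n in 1 2 3; do
        echo "change $n" >>"$1/file"
        git -C "$1" add file
        if [ "$n" = 1 ]; then d=$2; else d="131200000$n +0000"; fi
        GIT_AUTHOR_DATE=$d GIT_COMMITTER_DATE=$d \
            git -C "$1" commit -q -m "change $n"
    done
}

build_repo "$work/site1" "1311000000 +0000"
build_repo "$work/site2" "1311000001 +0000"   # root is one second later

tip1=$(git -C "$work/site1" rev-parse HEAD)
tip2=$(git -C "$work/site2" rev-parse HEAD)
tree1=$(git -C "$work/site1" rev-parse 'HEAD^{tree}')
tree2=$(git -C "$work/site2" rev-parse 'HEAD^{tree}')

echo "site1 tip:  $tip1"
echo "site2 tip:  $tip2"
echo "site1 tree: $tree1"
echo "site2 tree: $tree2"
```

The two tips differ even though the later commits carry the same dates,
because each commit hashes its parent's SHA1.  The tree SHA1s match,
which is why "git diff C C'" comes out empty.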

On site2:

    $ time git fetch git://site1/repo
    warning: no common commits
    remote: Counting objects: 124166, done.
    remote: Compressing objects: 100% (64472/64472), done.
    remote: Total 124166 (delta 59815), reused 121350 (delta 57062)
    Receiving objects: 100% (124166/124166), 462.31 MiB | 5.31 MiB/s, done.
    Resolving deltas: 100% (59815/59815), done.
    From git://site1/repo
     * branch            HEAD       -> FETCH_HEAD
    0m56.25s user 0m5.18s sys 2m29.45s elapsed 41.11 %CPU

In a brand-new repo on site2, fetching this time with a teensy depth:

    $ time git fetch --depth=3 git://site1/repo
    warning: no common commits
    remote: Counting objects: 96440, done.
    remote: Compressing objects: 100% (58844/58844), done.
    remote: Total 96440 (delta 36454), reused 92650 (delta 35169)
    Receiving objects: 100% (96440/96440), 415.87 MiB | 7.38 MiB/s, done.
    Resolving deltas: 100% (36454/36454), done.
    From git://site1/repo
     * branch            HEAD       -> FETCH_HEAD
    0m40.40s user 0m5.27s sys 1m50.29s elapsed 41.41 %CPU

No meaningful savings in data transport: 415.87 MiB versus 462.31 MiB
for the full fetch.

I was hoping it would be possible to get just the changes, but walking
back to FETCH_HEAD~3 shows that the shallow fetch imports all the files.
That makes sense given the use case for shallow clone: the oldest commit
it keeps still needs a complete tree.  What I want is to tell the fetch
machinery that I already have one of the commits it is going to see.
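
For reference, the local stitching step is simple once the objects have
arrived; it just does not reduce what the fetch transfers.  Below is a
minimal sketch on throwaway repositories, assuming a Git new enough to
have "git replace --graft" (it postdates this thread).  The repository
layout, the branch names "top" and "proj", and the new_repo helper are
invented for the example.

```shell
#!/bin/sh
# Sketch only: join fetched history to a local duplicate with a replace ref.
set -e
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# new_repo DIR ROOT_DATE (hypothetical helper): commits A, B, C on branch
# "top" with identical content; the root date differs per repository.
new_repo() {
    git init -q "$1"
    git -C "$1" symbolic-ref HEAD refs/heads/top
    for n in 1 2 3; do
        echo "change $n" >>"$1/file"
        git -C "$1" add file
        if [ "$n" = 1 ]; then d=$2; else d="131200000$n +0000"; fi
        GIT_AUTHOR_DATE=$d GIT_COMMITTER_DATE=$d \
            git -C "$1" commit -q -m "change $n"
    done
}

new_repo "$work/site1" "1311000000 +0000"   # ...A---B---C
new_repo "$work/site2" "1311000001 +0000"   # ...A'--B'--C'

# site1 grows a project branch: D on top of C.
git -C "$work/site1" checkout -q -b proj
echo "project work" >>"$work/site1/file"
git -C "$work/site1" commit -q -a -m "D"

cd "$work/site2"
# This is the step that still transfers the whole duplicate history.
git fetch -q "$work/site1" proj:refs/remotes/site1/proj

before=$(git merge-base site1/proj top || true)   # empty: no common commits

# Replace D with a copy whose parent is the local C' instead of C.
git replace --graft site1/proj top

after=$(git merge-base site1/proj top)
echo "merge-base before: '${before}'"
echo "merge-base after:  $after"
```

After the graft, site1/proj appears to branch off the local top, so log
and merge-base behave as if the histories were shared.  The replacement
lives under refs/replace/ in site2 only.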

I'll just tell people to put up with the full copy, and try to fix
things so that only one site creates the git repo from p4 in the future.
Thanks for looking,

		-- Pete