Re: clone breaks replace

Phillip Susi wrote:

> I managed to
> correctly add a replace commit that truncates the history and contains
> instructions where you can find it, and running git log only goes back
> to the replacement commit, unless you add --no-replace-objects, which
> causes it to show the original full history.

Before I get to your real question: this seems a bit backwards.  Let
me say a few words about why.

In the days before replacement refs (and today, too), each commit
name described not only the state of a tree at a moment but the
history that led up to it.  In fact you can see this somewhat directly:
given two distinct commits A and B, if you try

	$ git cat-file commit A >a.commit
	$ git cat-file commit B >b.commit
	$ diff -u a.commit b.commit

then you will see precisely what can make them different:

 - the author's name and email and the date of authorship
 - the committer's name and email and the date committed
 - the names of the parent commits, describing the history
 - the name of a tree, describing the content
 - the log message, including its encoding

The commit name is a hash of that information (see git-hash-object(1)),
and git maintains the following invariant: "if a repository has access
to commit A, it has access to its parents, their parents, and so on".
This invariant is preserved by object transfer and garbage collection,
and relied on by object transfer and revision traversal.
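
To see the "hash of that information" part concretely, here is a small
sketch.  The throwaway repository, the file name and the "Demo" identity
are made up for illustration; the only claim is that hashing the raw
commit text as a commit object gives back the commit's own name.

```shell
#!/bin/sh
set -e

# Placeholder identity so the sketch does not depend on local git config.
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.invalid

# Throwaway repository with a single commit.
repo=$(mktemp -d)
cd "$repo"
git init -q
echo hello >file
git add file
git commit -q -m "first commit"

# The name git assigned to the commit.
actual=$(git rev-parse HEAD)

# The name recomputed from the raw commit text (tree, parents, author,
# committer, log message).  Without -w nothing is written to the
# object database.
recomputed=$(git cat-file commit HEAD | git hash-object -t commit --stdin)

echo "actual:     $actual"
echo "recomputed: $recomputed"
```

Changing any of the fields listed above, including the parents, would
change the recomputed name.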

The beauty of replacement refs is that they can be easily added or
removed without breaking this invariant.  And a replacement ref is an
actual reference into history, so garbage collection does not remove
those commits and the repository keeps enough information to traverse
both the modified and unmodified history.

Therefore, if you want clients to be able to choose between a minimal
history and a larger one to save bandwidth, it has to work like this:

 - to get the minimal history, fetch _without_ any replacement refs
 - to get the full history, fetch the replacement refs on top of that

because an additional reference can only increase the number of
objects to be downloaded.
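
A rough sketch of those two cases, under some assumptions: the
repository names "published" and "copy", the branch name "old" and the
file contents are invented, and "git replace --graft" is only available
in newer versions of git.  The clone uses --no-local so that objects
travel through the normal transport instead of being copied wholesale.

```shell
#!/bin/sh
set -e

# Placeholder identity so the sketch does not depend on local git config.
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.invalid

work=$(mktemp -d)
cd "$work"

# Publisher side: two commits of old history on branch "old".
git init -q published
cd published
git symbolic-ref HEAD refs/heads/old
for n in 1 2; do
	echo "old $n" >file
	git add file
	git commit -q -m "old history $n"
done

# A separate, short history on branch "simpler".
git checkout -q --orphan simpler
for n in 1 2; do
	echo "new $n" >file
	git add file
	git commit -q -m "new history $n"
done

# Replacement ref: give the root of "simpler" the old tip as a parent.
root=$(git rev-list --max-parents=0 simpler)
git replace --graft "$root" old

# Drop the branch; the replacement ref still keeps the old commits alive.
git branch -q -D old
cd ..

# Client side: a plain clone does not bring refs/replace along.
git clone -q --no-local published copy
cd copy
short_count=$(git rev-list --count HEAD)

# Opting in to the full history means fetching the replacement refs.
git fetch -q origin 'refs/replace/*:refs/replace/*'
full_count=$(git rev-list --count HEAD)

echo "without replacement refs: $short_count commits"
echo "with replacement refs:    $full_count commits"
```

The extra refspec only ever adds objects to what the clone already
has, which is why the minimal history has to be the default.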

> The problem is that when I clone the repository, I expect the clone to
> contain only history up to the replacement record, and not the old
> history before that.  Instead, the clone contains only the full original
> history, and the replacement ref is not imported at all.  A git replace
> in the new clone shows nothing.
>
> Shouldn't clone copy .git/refs/replace?

With that in mind, I suspect the best way to achieve what you are
looking for is the following:

 1. Make a big, ugly history (branch "big").  Presumably this part's
    already done.

 2. Find the part you want to get rid of and make appropriate
    replacement refs so "gitk big" shows what you want it to.

 3. Use "git filter-branch" to make that history a reality (branch
    "simpler").  Remove the replacement refs.

 4. Use "git replace" to graft back on the pieces you cauterized.
    Publish the result.

 5. Perhaps also run and publish "git replace big simpler", so
    contributors of branches based against the old 'big' can merge
    your latest changes from 'simpler'.  Encourage contributors to
    use 'git rebase' or 'git filter-branch' to rebase their
    contributions against the new, simpler history.
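
Roughly, and only as a sketch, the steps above could look like the
following.  The five-commit history, the file name and the cut point
are invented for illustration; "git replace --graft" assumes a newer
git than the one this thread started with; and
FILTER_BRANCH_SQUELCH_WARNING only silences the notice that newer
versions of filter-branch print.

```shell
#!/bin/sh
set -e

# Placeholder identity so the sketch does not depend on local git config.
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.invalid
export FILTER_BRANCH_SQUELCH_WARNING=1

work=$(mktemp -d)
cd "$work"
git init -q demo
cd demo

# Step 1: the big history (five commits on branch "big").
git symbolic-ref HEAD refs/heads/big
for n in 1 2 3 4 5; do
	echo "$n" >file
	git add file
	git commit -q -m "commit $n"
done

# Step 2: a replacement ref that makes commit 3 look like a root commit.
cut=$(git rev-parse big~2)
old_tip=$(git rev-parse big~3)
git replace --graft "$cut"

# Step 3: make that view permanent on branch "simpler", then remove
# the temporary replacement ref.
git branch simpler big
git filter-branch -- simpler >/dev/null
git replace -d "$cut" >/dev/null

# Step 4: graft the cut-off part back on, this time as an optional
# replacement ref on the new root of "simpler".
new_root=$(git rev-list --max-parents=0 simpler)
git replace --graft "$new_root" "$old_tip"

# Step 5: let the old tip of "big" stand in for the tip of "simpler".
git replace big simpler

short_count=$(git --no-replace-objects rev-list --count simpler)
full_count=$(git rev-list --count simpler)
echo "simpler without replacement refs: $short_count commits"
echo "simpler with replacement refs:    $full_count commits"
```

Publishing "simpler" and the refs under refs/replace/ then gives
clients the short history by default and the long one on request.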

Does that make sense?

Jonathan