Re: rebase-with-history -- a technique for rebasing without trashing your repo history

On 2009.08.14 23:21:01 +0200, Michael Haggerty wrote:
> Björn Steinbrink wrote:
> > On 2009.08.14 00:39:48 +0200, Michael Haggerty wrote:
> >> Björn Steinbrink wrote:
> >>> On 2009.08.13 14:46:07 +0200, Michael Haggerty wrote:
> > [...]
> > Doing a plain "git rebase subsystem topic" would of course also try to
> > rebase the "o" commits, so that's problematic. Instead, you do:
> > 
> > git rebase --onto subsystem O topic
> > 
> > That turns O..topic (the * commits) into patches, and applies them on
> > top of O'. So the "o" commits aren't to be rebased.
> > 
> > And that's exactly what your rebase-with-history would do as well. Just
> > that O is naturally a common ancestor of subsystem and topic, and so
> > just using "git rebase-w-h subsystem topic" would be enough. Conflicts
> > etc. should be 100% the same.
> > 
> > If you know that your upstream is going to rebase/rewrite history, you
> > can tag (or otherwise mark) the current branching point of your branch,
> > so you can easily specify it for the --onto rebase. IOW: This is
> > primarily a social problem (tell your downstream that you rebase this or
> > that branch), but having built-in support to store the branching point
> > for rebasing _might_ be worth a thought.
> 
> Recording branch points manually, coordinating merges via email -- OMG
> you are giving me flashbacks of CVS ;-)

Not merging, but rewriting history. One of the primary purposes of
rebasing is to forget the old history; the new version overrides it. And
telling someone to forget something is a social problem. You can help
the user forget the history by tracking the branching points, and I
said that git could maybe learn to do that, so the user doesn't have
to. Quick idea:

On branch creation, create refs/bases/<branchname> (let's call that
<base>) referencing the commit the branch initially references.

On rebase, check if <branchname>..<onto> is not empty. If so, update
refs/bases/<branchname> to reference <onto>.

On reset, check if the commit the branch head is being reset to is
reachable through the commit the branch head currently references. If
not, update <base> to reference the commit we're resetting to.

Find some sane syntax for rebase that implicitly uses <base> as the
<upstream> argument, e.g. just "git rebase --onto <whatever>" could work
as "git rebase --onto <whatever> <base>".

Most likely, I missed a bunch of corner cases though...
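
The bookkeeping above can be tried by hand today. This is only a rough
sketch, not a proposal for how git would implement it: the refs/bases/
namespace is the hypothetical one from this mail, the commit() helper
and the throwaway repository are made up for the demo, and the
update-ref calls stand in for what branch creation and rebase would do
automatically.

```shell
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Throwaway repository; commit() is a demo helper, one new file per commit.
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

commit m1
git checkout -q -b subsystem; commit o1; commit o2

# "On branch creation": remember where topic started (hypothetical ref).
git checkout -q -b topic
git update-ref refs/bases/topic "$(git rev-parse HEAD)"
commit t1; commit t2

git checkout -q master; commit m2

# Upstream rewrites subsystem; the old o commits are now stale.
git rebase -q master subsystem

# "On rebase": use the recorded base as <upstream>, so only the topic
# commits are replayed, then move the base if <onto> had new commits.
ahead=$(git rev-list --count topic..subsystem)
git rebase -q --onto subsystem refs/bases/topic topic
if [ "$ahead" -gt 0 ]; then
    git update-ref refs/bases/topic "$(git rev-parse subsystem)"
fi

# Subjects of the commits that ended up on top of the new subsystem.
replayed=$(git log --format=%s subsystem..topic | tr '\n' ' ')
echo "$replayed"
```

The only point is that the recorded base selects just the topic's own
commits, without the user having to remember the old branching point.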

> *Of course* you can get around all of these problems if you put the
> burden of bookkeeping on the user.  The whole point of
> rebase-with-history is to have the VCS handle it automatically!

What your approach does is simply move the "just forget the
history" part. Instead of forgetting it at rebase time, you have to
forget it when you want to submit patches. It's obviously a bit easier
though, as you can just say "--first-parent <upstream>", assuming that
you teach format-patch to use a special first-parent diff mode for the
merge commits (see below).

> >> and merging in a topic branch makes it more difficult to create an
> >> easily-reviewable patch series.  rebase-with-history has neither of
> >> these problems.
> > 
> > Sure, merging is a no-go if you submit patches by email (or other,
> > similar means). But you compared that to an "enhanced" rebase approach,
> > instead of comparing your rebase approach to the currently available
> > one.
> 
> In [1] I compared rebase-with-history with both of the
> currently-available options (rebase and merge).  Rebase and merge can
> each deal with some of the issues that come up, but each one falls flat
> on others.  I believe that rebase-with-history has the advantages of both.

And some disadvantages.

1) Cluttered history, which needs to be rewritten again when the emailed
patches are just for review, but the maintainer will actually merge from
you later.

Taking the old master, subsystem, topic example, you get (for example):

          o2--o2 (subsystem)
         /     \
m---m---m---m---m (master)
     \   \
      \   o'--o'
       \ /   / \
        o---o   *'--*' (topic)
             \ /   /
              *---*

Now the user who maintains "topic" is back at the hard case. He now
needs to rebase onto master, using the last o' as <upstream>. The
DAG doesn't help here, whereas the base-tracking would handle that.


2) Merge commits, which are usually displayed in a special format. So
for "git show" or "git log -p" to give useful output for those special
merges, you'd have to introduce a new "diff only against first-parent"
mode, and mark those merges in a special way, so that diff mode is used
for them, but not for real merges. And users of old git versions would
have to deal with the basic -m merge diff mode, ignoring the useless
diff for the second parent and the fact that the real merges also get
shown in that format. The base tracking doesn't have this problem
either.
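
To make the two diffs concrete, here is a small sketch that fakes one
such commit with an ordinary merge. It assumes the parent order implied
above (first parent = new upstream history, second parent = the original
commit); the "topic-rwh" branch name, the commit() helper and the
throwaway repository are made up for the demo.

```shell
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Throwaway repository; commit() is a demo helper, one new file per commit.
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

commit m1
git checkout -q -b topic; commit t1
git checkout -q master;   commit m2

# Stand-in for a "rebased with history" commit: parents are the new
# upstream (first) and the original topic commit (second).
git checkout -q -b topic-rwh master
git merge -q --no-ff -m "t1'" topic

# The diff a reviewer wants: what the commit changes relative to upstream.
first=$(git diff --name-only HEAD^1 HEAD)
# The "useless" one: everything upstream changed since the old base.
second=$(git diff --name-only HEAD^2 HEAD)
echo "vs first parent:  $first"
echo "vs second parent: $second"
```

Diffing against the first parent by hand works with any git version, but
it is exactly the per-commit special-casing that "git show" and
"git log -p" would have to learn for these merges.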


> The example in [2] was taken straight from the git-rebase man page [3];
> I did not want to claim that current practice would use merging in this
> situation, but rather just to show that rebase-with-history removes the
> pain from this well-known example.

Well, the man page says: Don't merge, the rebase needs to trickle down,
but you'll likely need to use "git rebase --onto subsystem subsystem@{1}".
So rebase-with-history really just saves you the "use --onto and the
right <upstream>" step from the hard case. The plain base-tracking does
the same.

Another way to reach the same goal would be just to explicitly override
the old history.

m---m---m (master)
     \
      o---o (subsystem)
           \
            *---* (topic)

(Hypothetical): git rebase --override master subsystem

Leads to:

m---m---m---- (master)
     \       \
      o---o---O---o'--o' (subsystem)
           \
            *---* (topic)

Where O is an --ours merge, that just marks the old o commits as merged,
but has the same tree as the last m commit.

Now topic can be rebased using: git rebase --override subsystem topic

m---m---m---- (master)
     \       \
      o---o---O---o'--o' (subsystem)
           \           \
            *---*-------X---*'--*' (topic)

Again, X being an ours merge.

As the O and X commits have the last o and * commits as their second
parents, this doesn't even break things like "git show" and "git log
-p", as the interesting commits aren't merge commits. So "git
log -p --first-parent subsystem..topic" would do the right thing
(optionally with --no-merges to avoid the merge commit, but seeing that
doesn't hurt that much I guess).

This also trivially supports the reorder, squash, edit, whatever stuff,
as it doesn't rely on 1:1 commit counterparts existing. But it also
falls flat on its face as soon as subsystem gets "really" rewritten, so
that the old history is no longer reachable from the new history.
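
Since "--override" doesn't exist, here is a rough emulation with
existing commands, just to show the shape of the result. The override()
function, the commit() helper and the throwaway repository are made up
for the demo; an ours merge plus cherry-pick is only one way to get
there.

```shell
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Throwaway repository; commit() is a demo helper, one new file per commit.
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

# Hypothetical stand-in for "git rebase --override <onto> <branch>".
override() {
    onto=$1 branch=$2
    old=$(git rev-parse "$branch")
    # Restart the branch at <onto> ...
    git checkout -q -B "$branch" "$onto"
    # ... mark the old history as merged without taking its changes
    # (first parent: <onto>, second parent: old tip, tree of <onto>) ...
    git merge -q -s ours -m "override old $branch history" "$old" >/dev/null
    # ... and replay whatever <onto> does not already contain.
    git cherry-pick "$onto..$old" >/dev/null
}

commit m1; commit m2
git checkout -q -b subsystem; commit o1; commit o2
git checkout -q -b topic;     commit t1; commit t2
git checkout -q master;       commit m3

override master subsystem
# The old o commits are reachable from the new subsystem, so this range
# contains only the topic's own commits; no explicit <upstream> needed.
override subsystem topic

picked=$(git log --first-parent --no-merges --format=%s subsystem..topic | tr '\n' ' ')
echo "$picked"
```

The second override() call is the interesting one: because the old
subsystem tip is an ancestor of the new one, the plain subsystem..topic
range already excludes the o commits.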

Björn