Re: Cleaning up history with git rebase

On Sun, 31 Jul 2011 14:20:37 -0300, Ricky Egeland wrote:

> I've succeeded at breaking apart this big repository using
> `git filter-branch`, but where I am failing is the cleanup of
> the history of these new sub-repositories. The original big
> repository was used for years in a CVS-like fashion, with about
> 20 or so developers doing a pull/edit/pull/push workflow using a
> centralized shared repository. Most developers were working on
> unrelated components, so merge conflicts were rare, but there are
> some exceptions to that. The end result is that there are a lot
> of merge commits in big-repo.git, and in the case of my split
> sub-repositories these merge commits still appear in the history,
> even for merges which did not involve files that end up in a
> given repository. In most cases, there are more merge commits in
> the history than there are commits that actually affected the
> code that is left in these sub-repositories. I really want to
> clean this up.


Maybe you could use a more sophisticated filter-branch script to
examine merge commits and split them up or throw them out as
necessary.
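
As a starting point for examining the merges, here is a rough sketch
that lists merge commits whose tree is identical to the tree of their
first parent, which makes them candidates for being thrown out. The
helper name `noop_merges' is made up for illustration, and whether such
a merge is really safe to drop in your repositories is an assumption
you would need to check:

```shell
#!/bin/sh
# Sketch: print merge commits that changed nothing relative to their
# first parent. noop_merges is a hypothetical helper, not a git command.
noop_merges() {
    # $1 is the revision to start from (defaults to HEAD).
    git rev-list --merges "${1:-HEAD}" |
    while read -r merge; do
        merge_tree=$(git rev-parse "$merge^{tree}")
        parent_tree=$(git rev-parse "$merge^1^{tree}")
        # Identical trees mean the merge brought in no new content.
        if [ "$merge_tree" = "$parent_tree" ]; then
            echo "$merge"
        fi
    done
}
```

Running `noop_merges HEAD' inside one of the split repositories would
print one commit name per line; an empty result means every merge
changed something relative to its first parent.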


> git rebase $(git rev-list --reverse HEAD | head -n 1)


As an aside, I would have expected to be able to limit the
`rev-list' output directly as in the following:

  git rebase $(git rev-list -1 --reverse HEAD)

but it doesn't work; when `-1' is passed, rev-list ignores
`--reverse', which I think is a bug.
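
To make the difference concrete, here is a small sketch that builds a
throwaway repository and compares the two forms. It also tries
`--max-parents=0', which asks only for commits without parents; that
option is an assumption on my part, since it may not exist in a git as
old as the 1.7.4 mentioned in this thread. The repository, file name,
and variable names are all made up:

```shell
#!/bin/sh
# Sketch: three ways of asking rev-list for "the first commit",
# run against a scratch repository.
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"

# Build a linear history of three commits.
for n in 1 2 3; do
    echo "$n" > file.txt
    git add file.txt
    git commit -q -m "commit $n"
done

# The form from the original message: list oldest first, keep line 1.
root_by_head=$(git rev-list --reverse HEAD | head -n 1)

# Ask only for parentless commits (assumes --max-parents is available).
root_by_parents=$(git rev-list --max-parents=0 HEAD)

# The form I expected to work; the count limit is applied first,
# so this names HEAD itself rather than the root.
limited=$(git rev-list -1 --reverse HEAD)

echo "reverse | head:   $root_by_head"
echo "--max-parents=0:  $root_by_parents"
echo "-1 --reverse:     $limited"
```

Note that a history with several unrelated roots would make
`--max-parents=0' print more than one line, so it is not a drop-in
replacement in every repository.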


> Which I take to mean "rebase this repository from the root to the
> current HEAD". In many cases it works perfectly, resulting in
> a short, clean history that only pertains to the files left in
> the new sub-repository. But some of the more actively developed
> components are problematic, as `git rebase` runs into
> conflicts and becomes interactive, and it is simply too tedious
> to use the interactive mode to resolve these problems. I've found
> a recipe for resolving these conflicts:


I don't understand why there would even be conflicts; can you give
a concrete example so that I might be able to understand the
scenario better?


>
>  - git status
>  - # look for files with problems like "both modified", or
>    "both added", set $CONFLICTFILE
>  - git checkout --theirs $CONFLICTFILE
>  - git add $CONFLICTFILE
>  - git commit -m 'Fixing conflict during rebase' $CONFLICTFILE


As an aside, putting `$CONFLICTFILE' on the `git commit' command
line seems to be redundant; from `git help commit':
  
  When files are given on the command line, the command commits
  the contents of the named files, without recording the changes
  already staged. The contents of these files are also staged for
  the next commit on top of what have been staged before.


>  - git rebase --continue
>  - # look for message like "did you forget to add..." if so, use --skip
>  - git rebase --skip
>  - # repeat as often as necessary
>
> For some of my sub-repositories this recipe did exactly what I
> wanted after repeating only a couple times. However, some of my
> sub-repositories have been forcing me to repeat this more than 50
> times, and I grew tired and started to look for ways to automate
> this. In essence, I want a non-interactive `git rebase`.
>
> To that end I upgraded my version of git to 1.7.4 and tried
> (without really understanding what these were doing):
>
> 1. git rebase -s recursive -X theirs \
>      $(git rev-list --reverse HEAD | head -n 1)
>
> 2. git rebase -s recursive -X ours \
>      $(git rev-list --reverse HEAD | head -n 1)
>
> 3. git rebase -s ours $(git rev-list --reverse HEAD | head -n 1)
>
> Methods 1 and 2 were still interactive and stopped at conflicts.
> Method 3 was automatic but left me with the sub-repository at the
> state of the root commit... the opposite of what I want.
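
For what it is worth, the manual recipe quoted above could be wrapped
in a loop along the following lines. This is only a sketch: the
function names are made up, it assumes a git recent enough to have
`git rev-parse --git-path', it assumes plain path names without
newlines or unusual characters, and it blindly keeps the version from
the commit being replayed (which is what `--theirs' means during a
rebase), so it can silently discard changes that mattered:

```shell
#!/bin/sh
# Sketch of the quoted recipe as a loop; all function names are
# hypothetical helpers, not git commands.

# True while a rebase is stopped part-way through.
rebase_in_progress() {
    [ -d "$(git rev-parse --git-path rebase-merge)" ] ||
    [ -d "$(git rev-parse --git-path rebase-apply)" ]
}

# Check out and stage the replayed commit's version of each
# conflicted path. Fails if a path has no "theirs" version.
take_theirs() {
    git diff --name-only --diff-filter=U |
    while IFS= read -r path; do
        git checkout --theirs -- "$path" || exit 1
        git add -- "$path" || exit 1
    done
}

# $1 is the upstream to rebase onto.
rebase_taking_theirs() {
    git rebase "$1" && return 0
    # The rebase failed without stopping mid-way: not a conflict.
    rebase_in_progress || return 1
    while rebase_in_progress; do
        if [ -n "$(git diff --name-only --diff-filter=U)" ]; then
            # Give up on conflicts this recipe cannot settle,
            # for example a path deleted on one side.
            take_theirs || { git rebase --abort; return 1; }
            GIT_EDITOR=true git rebase --continue || true
        else
            # Nothing left to resolve: the "did you forget to add" case.
            git rebase --skip || true
        fi
    done
}
```

It would be called with the same argument as the original command,
for example `rebase_taking_theirs "$(git rev-list --reverse HEAD | head -n 1)"'.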

The results of Method 3 are predicted by the `git help rebase' page:

  Because git rebase replays each commit from the working branch
  on top of the <upstream> branch using the given strategy, using
  the ours strategy simply discards all patches from the <branch>,
  which makes little sense.

In any case, how would you know which side to take in general?
It seems to me that you need to do something different at the
filter-branch step. Why are there conflicts anyway?

Sincerely,
Michael Witten