Re: Trying to sync two svn repositories with git-svn (repost)

On Thu, May 14, 2009 at 02:35:18AM -0400, Avery Pennarun wrote:

> By far the sanest thing you could possibly do is to create a central
> "public" branch that contains all the common commits, then merge from
> that public branch to the site-specific branches, but never merge in
> the opposite direction.  In case you happen to make some changes on
> the site-specific branches that you want to share, you can just
> cherry-pick them; the resulting conflicts when merging back are likely
> to be fairly minor.  This would be entirely consistent with git's
> normal operations, and would be easy:
> 
>    git checkout public
>    git cherry-pick stuff   # as rarely as possible; do the work
>                            # directly on public if you can
> 
>    git checkout svn-1
>    git merge --no-ff public
>    git svn dcommit
> 
>    git checkout svn-2
>    git merge --no-ff public
>    git svn dcommit
> 
> No criss-cross merges, no insanity, no question about whether it's correct.

Indeed, this looks pretty simple.  But AFAICS, this works only when
starting out with a fresh repository.  In my situation, public is
currently empty and has to be constructed from scratch by picking
from the privates.

So it seems I have to sync the privates in a first step and build the
public from that in a second step.

So here's my second plan:
1. instead of doing the cherry-picking in a single repository, it might
   be helpful to do it in separate repositories: one repository for each
   direction.  While there are still two remote svn repositories in each
   git repository, there is no need for criss-cross anymore.  The data
   flows in one direction only, and it seems (at least at first glance)
   that I can use git svn rebase to get a linear history.
2. After the synchronization is done, I would merge the two repositories
   into a third one to create the public repository.  Since this will be
   a pure git environment, I hope that the problems caused by svn's
   lack of merge support will vanish.
3. Once the public repository exists, create the privates based on that
   public.
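
For step 2, something along these lines might do.  This is only a rough
sketch: the function name make_public, the remote names "a" and "b" and
the argument order are made up for illustration.  It also assumes git 2.9
or later, where merging two histories without a common ancestor has to be
requested with --allow-unrelated-histories.

```shell
# Sketch: build a "public" repository from two already-synchronized
# repositories.  The function name, the remote names "a"/"b" and the
# argument order are made up for this example.
make_public() {
  public=$1
  # Resolve to absolute paths so the remotes stay valid after the cd below.
  repo_a=$(cd "$2" && pwd) || return 1
  branch_a=$3
  repo_b=$(cd "$4" && pwd) || return 1
  branch_b=$5

  git init -q "$public" || return 1
  (
    cd "$public" || exit 1
    git remote add a "$repo_a" &&
    git remote add b "$repo_b" &&
    git fetch -q a &&
    git fetch -q b &&
    # Start public from the first repository's branch ...
    git checkout -q --no-track -b public "a/$branch_a" &&
    # ... and merge the other one in.  The two svn histories share no
    # common ancestor, so git (2.9 and later) merges them only on request.
    git merge -q --allow-unrelated-histories \
        -m "Merge $branch_b into public" "b/$branch_b"
  )
}
```

It would be called like "make_public public to-svn-1 svn-1 to-svn-2 svn-2".
Conflicts are still possible where both sides touch the same files, and
they would have to be resolved by hand.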

Here's my first attempt for the first step:

  # set up a repository template for the synchronization and configure the
  # svn remotes
  mkdir -p svn-sync.templ
  (
    cd svn-sync.templ
    git svn init --stdlayout file:///svn/svn-1
    git config merge.stat true
    for remote in svn-1 svn-2; do
      # quote the refspecs so the shell never tries to expand the globs
      git config "svn-remote.$remote.url"      "file:///svn/$remote"
      git config "svn-remote.$remote.fetch"    "trunk:refs/remotes/$remote/trunk"
      git config "svn-remote.$remote.branches" "branches/*:refs/remotes/$remote/*"
      git config "svn-remote.$remote.tags"     "tags/*:refs/remotes/$remote/tags/*"
      git svn fetch -R "$remote"
      # local branch and a tag marking the state before any synchronization
      git checkout -b "$remote" "$remote/trunk"
      git tag "$remote-orig" "$remote"
    done
    git gc
  )
  
  # now copy the template to create the repositories where the actual
  # synchronization will be done
  cp -a svn-sync.templ  to-svn-1
  cp -a svn-sync.templ  to-svn-2
  
  # move cherries from svn-1 to svn-2 in the to-svn-2 repository
  (
    cd to-svn-2
    git svn fetch svn-1
    git checkout svn-2
    # [ pick cherries, e.g. with git cherry-pick ]
    git svn dcommit
    git tag -f svn-1-lastmerge svn-1
  )
  
  # move cherries from svn-2 to svn-1 in the to-svn-1 repository
  (
    cd to-svn-1
    git svn fetch svn-2
    git checkout svn-1
    # [ pick cherries, e.g. with git cherry-pick ]
    git svn dcommit
    git tag -f svn-2-lastmerge svn-2
  )
  
  # time passes
  
  # Move new commits from svn-1 to svn-2
  (
    cd to-svn-2
    git checkout svn-1
    git svn rebase
    git checkout svn-2
    git svn rebase svn-1
    # [ more cherries, e.g. with git cherry-pick ]
    git svn dcommit
    git tag -f svn-1-lastmerge svn-1
  )
  
  # Move new commits from svn-2 to svn-1
  (
    cd to-svn-1
    git checkout svn-2
    git svn rebase
    git checkout svn-1
    git svn rebase svn-2
    # [ more cherries, e.g. with git cherry-pick ]
    git svn dcommit
    git tag -f svn-2-lastmerge svn-2
  )

At first glance, this seems to work.  But there's the drawback that I
have to keep track manually of what has been merged.  So there's
certainly room for improvement :)
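
One possible way to reduce the bookkeeping is to let git compare the
patches itself.  The sketch below uses git cherry, which matches commits
by patch-id; the function name unpicked is made up for illustration.

```shell
# Sketch: list the commits on branch $2 that have no patch-equivalent
# commit on branch $1.  The function name "unpicked" is made up for this
# example.
unpicked() {
  to=$1
  from=$2
  # git cherry marks a commit of $from with "+" when $to contains no
  # commit with the same patch-id, and with "-" when it does.  Keep only
  # the "+" lines and strip the marker.
  git cherry -v "$to" "$from" | sed -n 's/^+ //p'
}
```

In the to-svn-2 repository, "unpicked svn-2 svn-1" would then print the
hash and subject of every svn-1 commit that still looks unmerged.  Since
the patch-id is computed from the diff only, the changed commit message
after a dcommit should not matter, but a cherry-pick that needed conflict
resolution will still be listed.  The lastmerge tags remain usable as a
cross-check, e.g. "git log --oneline svn-1-lastmerge..svn-1".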

> More as an academic exercise than anything, I did find a way that will
> let you do criss-cross merging of all changes on A and B.  I still
> don't *really* recommend you use it, because it's extremely error
> prone, and there are lots of places where you could get merge
> conflicts and then end up in trouble.  (The above simple method, in
> contrast, might get conflicts sometimes, but you can just fix them as
> you encounter them and be done with it.)
> 
> The script below demonstrates how to take branches remote-ab and
> remote-ac, and auto-pick their changes (as they happen) into a new
> (automatically managed) branch public.  Then it merges public back
> into each branch, while avoiding conflicts.  The magic itself happens
> in extract() and crossmerge().
> 
> If nothing else, this method makes the gitk output far more sane than
> the original method.  This is because it doesn't include the history
> of 'public' in the site-specific branches.  That was the fundamental
> flaw in the method I had identified originally.  You can trick that
> original method into working too, but it's stunningly complex.  This
> is much more sane, albeit still not really sane.

I will have to play a little bit with this script to get a better
understanding of how it works.  But from the description, I got the
impression that it matches my (current) work flow pretty well:
Currently, initial changes are done in some private repository and
propagated to the other repositories from there.  The only exception
is that currently, there's no such thing as a "public" repository.

> P.S. Sorry for the mess.  I suppose I should have broken down and
> written (or asked for :)) a minimal test case earlier, as it quickly
> revealed the problem.

Oh, I have learned a lot in this thread.  And BTW: I _have_ tried to
write a minimal test case several times.  But I simply was not able to
reproduce the problems there.  The problems showed up only on the real
repositories.

Thank you very much Avery!