Re: git-svn with big subversion repository

Hi John,

I've successfully run git svn clone on a repository with about 100k
revisions. The clone was not of the whole repository, but rather a
subdirectory for a project using the trunk/tags/branches structure.
The project is about 200k files and about 4GB.

The initial clone took hours and hours (on my MacBook). I basically
had to leave it running overnight (the svn server is here on the LAN,
accessed over https).

The only problem I had was that the clone would occasionally exit (not
stall, as you say). This is a known problem described here:
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=526989

The solution is to just run git svn fetch, as the cloning will pick up
where it stopped. To keep from having to do this yourself, loop the
fetch in a shell script. I blogged about it here:
<http://blog.tfnico.com/2010/07/living-with-subversion-and-git-in.html>
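
For example, here is a minimal sketch of such a loop. It assumes that
git svn fetch returns a non-zero exit status when it dies partway and
zero once it has caught up. The helper name retry_until_ok and the
RETRY_DELAY variable are made up for illustration; they are not part
of git.

```shell
#!/bin/sh
# Re-run a command until it succeeds or the attempt limit is reached.
# retry_until_ok is a hypothetical helper, not something git provides.
retry_until_ok() {
    max_attempts=$1
    shift
    attempt=1
    while [ "$attempt" -le "$max_attempts" ]; do
        # Run the command given as the remaining arguments.
        if "$@"; then
            return 0
        fi
        echo "attempt $attempt failed; retrying" >&2
        attempt=$((attempt + 1))
        # Pause briefly before the next try (default 5 seconds).
        sleep "${RETRY_DELAY:-5}"
    done
    return 1
}

# Example, run inside the partially cloned repository:
#   retry_until_ok 50 git svn fetch
```

The attempt limit is there so that a permanent failure (bad URL,
server down) does not loop forever; raise it if your clone needs more
restarts than that.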

And there are also some more tricks and tips for living with git+svn
here: <http://www.tfnico.com/presentations/git-and-subversion>

You could also investigate how the Apache folks have made their git
mirrors here: http://git.apache.org/ - at least they have an SVN repo
with over a million revisions. I think they did something like
svn-dump + git fast-import, but I couldn't find any details on the
fly.


On Wed, Mar 2, 2011 at 3:43 AM, John Kristian <jkristian@xxxxxxxxxxxx> wrote:
> How do you recommend using git to work with branches of a large, busy
> subversion repository? In general, how can small teams use git for their
> tasks, and use subversion to coordinate with a larger organization?
>
> git-svn has some trouble, I find. For example, this tries to copy the entire
> repo starting with revision 1:
>
> git svn clone --stdlayout svn+ssh://server/repo/project
>
> This would take weeks, I estimate, for my subversion repository.
>
> Choosing a subset of the repository enables git svn clone to cope, but then
> git svn fetch will stall after processing a few revisions.  For example:
>
> git svn clone --no-follow-parent --no-minimize-url \
>  --branches=branches \
>  --ignore-paths="^(?!branches/(TEAM_|RELEASE_))" \
>  -r $BASE svn+ssh://server/repo/project
> git svn fetch --no-follow-parent # stalls
>
> I don't know why it stalls. I guess it's doing something that requires
> processing the entire subversion repository.
>
> The best I can do is clone each subversion branch into a separate svn-remote
> section of the .git/config file, for example:
>
> git svn clone --no-follow-parent --no-minimize-url \
>  --svn-remote=TEAM_FOO --id=TEAM_FOO \
>  -r $BASE svn+ssh://server/repo/project/branches/TEAM_FOO
> git svn fetch --no-follow-parent
>
> The clone runs about as long as svn checkout, and the fetch replays the
> later revisions briskly. Sadly, the relationship between branches isn't
> fetched: git log won't tell me how a given subversion branch was copied from
> another. I use svn for that.
>
> I'm using git version 1.7.4, git-svn version 1.7.4 (svn 1.6.5), svn version
> 1.6.0 (r36650) and Mac OS X version 10.6.5. I got git from MacPorts.
>
> - John Kristian
>
> --
> To unsubscribe from this list: send the line "unsubscribe git" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>

