Kevin Ballard <kevin@xxxxxx> wrote:
> On Jun 1, 2008, at 10:40 PM, Eric Wong wrote:
>
> >Kevin Ballard <kevin@xxxxxx> wrote:
> >>On Jun 1, 2008, at 10:00 PM, Eric Wong wrote:
> >>
> >>>Kevin Ballard <kevin@xxxxxx> wrote:
> >>>>I started a git-svn clone on a large svn repository, and I noticed
> >>>>that for various branches, it kept pulling down the exact same
> >>>>revisions (starting at r1). In other words, if I had 4 branches that
> >>>>shared common history, their common history all got pulled down 4
> >>>>times. I double-checked, and the created commit objects were
> >>>>identical.
> >>>>
> >>>>Why was git-svn pulling down the same revisions over and over, when
> >>>>it already knows it has a commit object for those revisions?
> >>>
> >>>Can you give me an example of a repository and command-line you used
> >>>that does this? Did you use 'git svn clone -s' or did you manually
> >>>specify the branch locations in the repo?
> >>>
> >>>It could even be a lack of read permissions to the repository root
> >>>that would cause things like this.
> >>
> >>The repository is, unfortunately, a private repo so I can't share it.
> >>I used `git svn clone -s` to clone it. I have the SVN perl bindings
> >>v1.4.4 (according to git svn --version).
> >>
> >>I definitely have read permissions to the repo root. If I specify to
> >>only fetch -r 12000:HEAD (there's 14000-odd revisions), it doesn't
> >>pull down any duplicates, but when I let it start from the root, it
> >>pulls down hundreds of duplicates for multiple branches.
> >
> >Can you at least send me the 'svn log -v' output for that repo?
> >Feel free to leave out the actual log messages and munge the path
> >names if you can't expose that information.
>
> I'll have to do it tomorrow when I'm at the office. How much log info
> do you need? I can let it run until I see duplicate revisions (it's
> pretty obvious, it starts over again from r1).

I'll need the revisions where branches were created from the common
ancestor (presumably trunk) and some revisions before it.

For debugging problems with restricted repositories, it may be worth
it to create a repository skeleton cloning tool that just reads
the output of 'svn log --xml -v' and recreates a new SVN repository
with:

* all log messages stripped

* all new files are created with just a random string in them
  (to throw off rename detection on the git side)
  (except symlinks, see below)

* all path components tokenized and each token replaced with a
  dictionary value.  Something like:

    @tmp = map { $tok{$_} ||= ++$i; $tok{$_} } split(/\//, $old_path);
    $new_path = join('/', @tmp);

  This way all copy history can be preserved

* all modified files will just get a random byte appended to them

* all committer names replaced with a dictionary value
  (similar to what is done to path components).

-- 
Eric Wong
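
The dictionary replacement described for path components and committer names could be sketched roughly as follows. This is only an illustration, written in Python rather than Perl, and not part of git-svn or any existing tool: the `Anonymizer` class, the `anonymize_path` helper, and the `p`/`user` prefixes are hypothetical names. It also assumes that empty components (such as the one before a leading '/') should be left alone, whereas the Perl one-liner would assign them a token as well.

```python
class Anonymizer:
    """Maps each distinct token to a stable, opaque replacement.

    Hypothetical helper for illustration only.
    """

    def __init__(self, prefix):
        self.prefix = prefix
        self.table = {}

    def token(self, value):
        # Same idea as `$tok{$_} ||= ++$i` in the Perl snippet: the first
        # time a value is seen it gets the next counter value, and every
        # later lookup returns that same replacement.
        if value not in self.table:
            self.table[value] = f"{self.prefix}{len(self.table) + 1}"
        return self.table[value]


def anonymize_path(paths, old_path):
    """Replace every non-empty component of a '/'-separated path."""
    parts = old_path.split("/")
    # Assumption: empty components are kept as-is so that a leading or
    # trailing '/' survives the rewrite.
    return "/".join(paths.token(p) if p else p for p in parts)
```

Because one table is shared across all paths, a copy source such as `/trunk/src` and its destination `/branches/foo/src` keep the same replacement for `src`, so the copy relationships in the log stay recognizable. A second `Anonymizer` instance with a different prefix would serve for committer names.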