Re: git-svn pulling down duplicate revisions

On Jun 2, 2008, at 3:42 AM, Eric Wong wrote:

Kevin Ballard <kevin@xxxxxx> wrote:
On Jun 1, 2008, at 10:40 PM, Eric Wong wrote:

Kevin Ballard <kevin@xxxxxx> wrote:
On Jun 1, 2008, at 10:00 PM, Eric Wong wrote:

Kevin Ballard <kevin@xxxxxx> wrote:
I started a git-svn clone on a large svn repository, and I noticed
that for various branches, it kept pulling down the exact same
revisions (starting at r1). In other words, if I had 4 branches that
shared common history, their common history all got pulled down 4
times. I double-checked, and the created commit objects were
identical.

Why was git-svn pulling down the same revisions over and over, when it
already knows it has a commit object for those revisions?

Can you give me an example of a repository and command-line you used
that does this? Did you use 'git svn clone -s' or did you manually
specify the branch locations in the repo?

It could even be a lack of read permissions to the repository root
that would cause things like this.

The repository is, unfortunately, a private repo so I can't share it.
I used `git svn clone -s` to clone it. I have the SVN perl bindings
v1.4.4 (according to git svn --version).

I definitely have read permissions to the repo root. If I specify to
only fetch -r 12000:HEAD (there's 14000-odd revisions), it doesn't
pull down any duplicates, but when I let it start from the root, it
pulls down hundreds of duplicates for multiple branches.

Can you at least send me the 'svn log -v' output for that repo?
Feel free to leave out the actual log messages and munge the path
names if you can't expose that information.

I'll have to do it tomorrow when I'm at the office. How much log info
do you need? I can let it run until I see duplicate revisions (it's
pretty obvious, it starts over again from r1).

I'll need the revisions where branches were created from
the common ancestor (presumably trunk) and some revisions
before it.

For debugging problems with restricted repositories, it may be worth it
to create a repository skeleton cloning tool that just reads the output
of 'svn log --xml -v' and recreates a new SVN repository with:

 * all log messages stripped

 * all new files are created with just a random string in them (to
   throw off rename detection on the git side)
   (except symlinks, see below)

 * all path components tokenized and each token replaced with
   a dictionary value.  Something like:

   @tmp = map { $tok{$_} ||= ++$i; $tok{$_} } split(/\//, $old_path);
   $new_path = join('/', @tmp);

   This way all copy history can be preserved.

 * all modified files will just get a random byte appended to them

 * all committer names replaced with a dictionary value (similar to
   what is done to path components).
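
As a rough illustration of the tokenizing steps listed above, here is a
minimal Python sketch. It only reads 'svn log --xml -v' output and
builds anonymized per-revision records; it does not recreate a
repository or generate file contents. The names make_tokenizer,
anonymize_path and skeleton are hypothetical, as is the "p1"/"u1" token
naming, and the record layout is just one possible choice.

```python
import xml.etree.ElementTree as ET


def make_tokenizer(prefix):
    """Return a function mapping each distinct string to a stable token."""
    table = {}

    def tokenize(value):
        if value not in table:
            table[value] = f"{prefix}{len(table) + 1}"
        return table[value]

    return tokenize


def anonymize_path(path, tok):
    # Tokenize per component so shared prefixes (and copy history) survive.
    return "/".join(tok(part) if part else "" for part in path.split("/"))


def skeleton(log_xml):
    """Turn 'svn log --xml -v' text into a list of anonymized revisions."""
    path_tok = make_tokenizer("p")    # hypothetical token prefixes
    author_tok = make_tokenizer("u")
    entries = []
    for entry in ET.fromstring(log_xml).iter("logentry"):
        changes = []
        for p in entry.iter("path"):
            copy_from = p.get("copyfrom-path")
            changes.append({
                "action": p.get("action"),
                "path": anonymize_path((p.text or "").strip(), path_tok),
                "copyfrom_path": (anonymize_path(copy_from, path_tok)
                                  if copy_from else None),
                "copyfrom_rev": p.get("copyfrom-rev"),
            })
        entries.append({
            "rev": int(entry.get("revision")),
            "author": author_tok(entry.findtext("author", default="")),
            "msg": "",  # log messages are stripped
            "changes": changes,
        })
    return entries
```

Because the same token table is used for both the changed path and its
copyfrom path, a branch copied from trunk still points at the
tokenized trunk, which is the part that matters for reproducing the
branch-point behaviour.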

Isn't there a script somewhere that's supposed to do this? Do you know
where it is?

Incidentally, I just checked and when I start the git-svn clone, it
starts pulling down revisions for the branch 'css_refactor@1559' (odd
branch name, but it claimed to find multiple branch points for this
'css_refactor' branch). My guess is when it starts working on the next
branch, it doesn't view it as related to css_refactor and starts
pulling down the revisions again even though those revisions actually
belonged to trunk.

-Kevin

--
Kevin Ballard
http://kevin.sb.org
kevin@xxxxxx
http://www.tildesoft.com

