following untracked parents in git-svn

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Eric et al.,

While using git-svn to work with a repository with a very complex history I
discovered a very unfortunate behavior:

In general when a branch was derived (copied) from somewhere else git-svn
follows this parent branch and imports it.  If multiple branches do that
git-svn detects that the corresponding parrent branch already had been
imported and reuses the imported data.  Unfortunately when the parent
directory in the svn repository is not tracked as a branch in the svn-remote
section of the config file (for instance when it is just a subdirectory of a
tracked branch) this situation is no longer detected and this parent branch is
imported multiple times with the same result.  In a large repository this can
increase importing time drastically.

My analysis (as far as I understand the code) is that this is because the map
files in .git/svn are indexed by their ref name in the git repository.
Untracked branches are indexed by the name of their following branch ref name
followed by @XX where XX is the revision number of the branch point.
Obviously with that scheme the index name for two branches following a common
parent tree is different and thus an already imported tree is not correctly
detected.

My thoughts where now that this could potentially be fixed by not indexing
those map files by their ref name in the git repository but by their location
in the original svn repository.  Given that my understanding of the git-svn
code is not good enough to decide about all the consequences of such a design
change I'd like to ask you whether you think this change would be a good idea
or whether I might have overlooked a fundamental problem that makes it
impossible (or at least hard) to implement this idea.

Since my description of the problem might be a bit confusing without an
example I created a very small svn repository that shows this problem.  A svn
repository dump for it is attached.  When importing this repository using the
svn-remote section

[svn-remote "svn"]
	url = file:///dev/shm/x/svn1
	fetch = trunk:refs/remotes/trunk
	branches = branches/*:refs/remotes/*
	tags = tags/*:refs/remotes/tags/*

you will get the following behavior during the import:

$ git svn init -s file:///dev/shm/x/svn1
Initialized empty Git repository in /dev/shm/x/git2/.git/
$ git svn fetch
r1 = 7920f3e7e70c9bb9d8a7caf28830c7ed205c20c6 (refs/remotes/trunk)
	A	x/alpha
r2 = db7ad1b41f1d2ad18d198b9a80d2606b27557faf (refs/remotes/trunk)
	A	x/beta
r3 = a35cab9c510f66d96437f21ecb738c93e0c6b793 (refs/remotes/trunk)
Found possible branch point: file:///dev/shm/x/svn1/trunk/x => file:///dev/shm/x/svn1/branches/foo1, 2
Initializing parent: refs/remotes/foo1@2
	A	alpha
r2 = 5584693b5216dc1fa05f56455c67dfd61093ee43 (refs/remotes/foo1@2)
Found branch parent: (refs/remotes/foo1) 5584693b5216dc1fa05f56455c67dfd61093ee43
Following parent with do_switch
	A	beta
Successfully followed parent
r4 = d0cb7cfc1f69e52ecd39d8eb67518abe136b53d3 (refs/remotes/foo1)
Found possible branch point: file:///dev/shm/x/svn1/trunk/x => file:///dev/shm/x/svn1/branches/foo2, 2
Initializing parent: refs/remotes/foo2@2
	A	alpha
r2 = 5584693b5216dc1fa05f56455c67dfd61093ee43 (refs/remotes/foo2@2)
Found branch parent: (refs/remotes/foo2) 5584693b5216dc1fa05f56455c67dfd61093ee43
Following parent with do_switch
	A	beta
Successfully followed parent
r5 = 181cb81070b816bef74adefa1bc4c451100a5eef (refs/remotes/foo2)
Checked out HEAD:
  file:///dev/shm/x/svn1/trunk r3

As you can see file:///dev/shm/x/svn1/trunk/x is imported twice.  For this
small repository this is not a big issue but when this tree had a deep history
in a large repository you wanted to avoid that.

Robert

Attachment: svndump.gz
Description: GNU Zip compressed data

Attachment: pgpxfkITHpQIY.pgp
Description: PGP signature


[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]