On 11/26/06, Marko Macek <marko.macek@xxxxxxx> wrote:
Jon Smirl wrote:
>
> SVN hides the mini branch by creating a symbol like this:
>
> Symbol XXX, change set 70
> copy All from change set 50
> copy file A from change set 55
> copy file B,C from change set 60
> copy file D from change set 61
> copy file E,F,G from change set 63
> copy file H from change set 67
>
> It has to do all of those copies because the change sets weren't
> constructed while taking symbol dependency information into account.
>
> Symbol XXX can't copy from change set 69 because commits from after
> the symbol was created are included in change sets 51-69.

Sometimes it is not actually possible to have a 'simple' symbol, even by following proper symbol dependencies.

Some situations:
- tags on some files are readjusted later, or tagged separately with an older version
- tag is created with a -D "date" and the file times are not in sync
- tag is created from a mixed-revision working copy
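To make the quoted "Symbol XXX" example concrete, here is a minimal Python sketch of how such a copy list could be derived. It is not cvs2svn's code. The function name `plan_symbol_copies` and the input shape are assumptions made for illustration: each file maps to the inclusive range of change set numbers during which it sits at its tagged revision.

```python
def plan_symbol_copies(valid_ranges):
    """Pick a base change set and the per-file copies needed on top of it.

    valid_ranges maps path -> (first, last): the inclusive range of change
    set numbers during which the path is at its tagged revision
    (hypothetical input format).
    """
    # The best coverage of a set of closed ranges is always reached at the
    # start of one of them, so only those points need to be tried.
    candidates = {first for first, _ in valid_ranges.values()}

    def coverage(cs):
        return sum(1 for first, last in valid_ranges.values()
                   if first <= cs <= last)

    # Prefer the change set matching the most files; ties go to the later one.
    base = max(candidates, key=lambda cs: (coverage(cs), cs))

    # Every file that is not at its tagged revision in `base` needs its own copy.
    fixups = {path: first
              for path, (first, last) in sorted(valid_ranges.items())
              if not first <= base <= last}
    return base, fixups


ranges = {"A": (55, 60), "B": (40, 52), "C": (45, 50), "D": (30, 58)}
base, fixups = plan_symbol_copies(ranges)
```

A symbol is 'simple' exactly when `fixups` comes back empty, meaning one change set matches every tagged file.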
I agree that there are a few exceptions to making simple symbols. But the current cvs2svn makes no attempt at all to preserve simple symbols. In my attempts at converting Mozilla, 60% of the symbols ended up as tiny branches. I investigated a couple by hand and was able to rearrange things to create simple symbols in every case I looked at.

This can be dealt with during the topological sort. If there are complex symbol creations you will end up with loops during the sort process. At that point you need to start breaking up change sets to remove the loops. You would use a heuristic at this point, something like: try breaking up to ten commit change sets to preserve a symbol; if you can't preserve it with ten breaks, then break the symbol once and try again. Repeat until the loop is gone.

The current cvs2svn code effectively implements a heuristic in which the commits are always preserved at the expense of breaking the symbols. Since some commit comments are very common (blank ones, for example), those commits get combined into bigger change sets and trash the simple symbols.

Another note for doing a converter. When combining things into change sets for git import, commits with matching comments on a branch and on the trunk should not be merged into one change set. Git doesn't allow simultaneous commits to the trunk and branches.
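A rough Python sketch of the loop-breaking idea, under simplifying assumptions that are mine and not taken from cvs2svn: revisions are plain integers per file, there is a single symbol, and only the direct conflict is handled (one change set holding both a revision at or before the tag and a revision after it). The names `build_graph`, `split_changeset` and `order_with_symbol` are hypothetical. Returning `None` stands in for the "break the symbol" fallback, which is not implemented here.

```python
from graphlib import CycleError, TopologicalSorter


def build_graph(changesets, symbol, tagged):
    """Map each node to the set of nodes that must be emitted before it."""
    preds = {name: set() for name in changesets}
    preds[symbol] = set()
    history = {}  # path -> [(rev, change set name), ...]
    for name, items in changesets.items():
        for path, rev in items:
            history.setdefault(path, []).append((rev, name))
            if path not in tagged:
                continue
            if rev <= tagged[path]:
                preds[symbol].add(name)  # must land before the symbol
            else:
                preds[name].add(symbol)  # must land after the symbol
    # Revisions of one file must keep their order across change sets.
    for revs in history.values():
        revs.sort()
        for (_, earlier), (_, later) in zip(revs, revs[1:]):
            if earlier != later:
                preds[later].add(earlier)
    return preds


def split_changeset(changesets, name, tagged):
    """Split one change set into a pre-symbol half and a post-symbol half."""
    items = changesets.pop(name)
    before = {(p, r) for p, r in items if p not in tagged or r <= tagged[p]}
    changesets[name + ".a"] = before
    changesets[name + ".b"] = items - before


def order_with_symbol(changesets, symbol, tagged, max_breaks=10):
    """Return an ordering that keeps the symbol simple, or None if we give up."""
    changesets = {k: set(v) for k, v in changesets.items()}  # work on a copy
    breaks = 0
    while True:
        preds = build_graph(changesets, symbol, tagged)
        try:
            return list(TopologicalSorter(preds).static_order())
        except CycleError:
            # Only the direct case is handled: a change set that must be
            # both before and after the symbol.
            culprit = next((n for n in sorted(changesets)
                            if symbol in preds[n] and n in preds[symbol]), None)
            if culprit is None or breaks == max_breaks:
                return None  # the caller would have to break the symbol
            split_changeset(changesets, culprit, tagged)
            breaks += 1


changesets = {
    "cs1": {("A", 1), ("B", 1)},
    "cs2": {("A", 2), ("B", 2)},  # same comment, so grouped into one change set
}
tagged = {"A": 1, "B": 2}  # symbol XXX tags A at revision 1 and B at revision 2
order = order_with_symbol(changesets, "XXX", tagged)
```

In this example "cs2" is split so that the B commit lands before the symbol and the A commit after it, which leaves XXX as a plain copy of a single tree state.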
While in the cases of 'time warp' the revision sequence should be considered more important than timestamps, this is not necessarily true for tags, since it's easily possible to create them on mixed revisions.

cvs2svn also has a problem with vendor branches because it creates tags/branches that contain files from the vendor branch by copying some files from the trunk and other files from the vendor branch. If the vendor branch/tag was only used for the initial import, it's IMO best to skip them in the conversion (this needs a patch). There are however problems because keyword expansion causes file differences. It seems that the Mozilla CVS repository has vendor branches/imports in some parts of the tree.
I never got around to checking out problems with vendor branches in Mozilla.
Mark
-- 
Jon Smirl
jonsmirl@xxxxxxxxx