Hi Sam,

Sam Vilain writes:
> On Thu, 2010-06-24 at 13:07 -0500, Jonathan Nieder wrote:
> > operation. In other words, it needs the tree for
> > http://path/to/some/svn/root/branches@r11. This does not correspond
> > to a single git tree, since the content of each branch has been given
> > its own commit.
>
> I wrote at length about this near the beginning of the project;
> essentially, figuring out whether particular paths are roots or not is
> not defined, as SVN does not distinguish between them (a misfeature
> cargo culted from Perforce). It becomes a data mining problem, you have
> this scattered data, and you have to find a history inside.

Right. Implementing git-svn on top of git-remote-svn might not be a bad
idea.

> As I recommended before, it probably makes more sense to keep a "remote
> tracking" branch which mirrors the *entire* repository, and sort out
> efficient ways to convert SVN revision paths like the above into tree
> IDs.
>
> I consider it very important to separate the data import and tracking
> stage from the data mining stage.

We're following this approach. At the moment, we're just focusing on
getting all the data directly from SVN into the Git store. Instead of
building trees for each SVN revision, we've found a way to do it inside
the Git object store: we're currently ironing out the details, and I'll
post an update about this shortly.

> Once the data mining stage is well solved, then it makes sense to look
> at ways that a tracking branch which only tracks a part of the
> Subversion repository can be achieved. In the simple case, where no
> repository re-organisation or cross-project renames have occurred it is
> relatively simple. But in general I think this is a harder problem,
> which cannot always be solved without intervention - and so not
> necessary to be solved in short-term milestones. As you are
> discovering, it is a can of worms which you avoid if you know you always
> have the complete SVN repository available.

Right.
I'm not convinced that it necessarily requires user intervention though:
can you systematically prove that enough information is not available
without user intervention using an example? Or is it possible, but
simply too difficult (and not worth the effort) to mine out the data?

-- Ram