On Tue, Mar 6, 2012 at 2:34 PM, Andrew Sayers
<andrew-git@xxxxxxxxxxxxxxx> wrote:
[snip]
> On 06/03/12 19:29, Nathan Gray wrote:
> <snip>
>>
>> The problem of specifying and detecting branches is a major problem in
>> my upcoming conversion.  We've got toplevel trunk/branches/tags
>> directories but underneath "branches" it's a free-for-all:
>>
>> /branches/codenameA/{projectA,projectB,projectC}
>> /branches/codenameB (actually a branch of projectA)
>> /branches/developers/joe/frobnicator-experiment (also a branch of projectA)
>>
>> Clearly there's no simple regex that's going to capture this, so I'm
>> reduced to listing every branch of projectA, which is tedious and
>> error-prone.  However, what *would* work fabulously well for me is
>> "marker file" detection.  Every copy of projectA has a certain file at
>> its root.  Let's call it "markerFile.txt".  What I'd really love is a
>> way to say:
>
> This is quite close to the implementation I've got.  The SVN exporter
> runs in two stages:
>
> In the first stage, the script treats any non-blacklisted file as a
> marker file, but only looks for trunk branches.  It looks all through
> the history, traces back through the copyfroms, and tries to find the
> original directory associated with the file.  Usually it decides that
> the only branch without a copyfrom is /trunk.  Searching just for trunks
> with this weak heuristic makes it much easier to hand-verify the result.

I'm not sure I understand.  So if I have /trunk/projectA and
/trunk/projectB, do I have to blacklist /trunk/projectB to extract
only projectA's history?  Assuming it's always lived there, will your
code detect /trunk/projectA as the "trunk"?  Would it be possible to
specify /trunk/projectA directly instead of blacklisting everything
else?

> In the second stage, the script looks through the history again, tracing
> the copies of known branches in a slightly less clever way than
> described in my previous e-mail.  There's no need for marker files this
> time round, as we just assume any
> `svn cp /trunk /directory/not/within/a/branch` is a new branch.  In my
> experiments this has been a pretty solid way of detecting branches
> without too much human input - I might be missing something (or have
> mis-explained something), but I'd be interested to hear examples of
> where this would go wrong.

That sounds pretty good, but it should probably also be transitive,
i.e. `svn cp /any/known/branch/root /some/new/path` is also a new
branch.  Sometimes we'll spin off hotfix branches from release
branches, for example.

I'll have to give your code a try and see how it works.

Cheers,
-n8

--
http://n8gray.org
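
P.S. To make the transitive rule concrete, here is a rough Python sketch
of what I have in mind.  It is not Andrew's exporter and doesn't talk to
SVN at all: the `Copy` record, `is_within` and `detect_branches` are
names I made up, and I'm assuming the copy events have already been
pulled out of the log somehow.  The copy sources in the sample history
are guesses, and /branches/docs-only and /branches/codenameB/nested are
invented purely to exercise the two rejection cases.

```python
from typing import Iterable, NamedTuple


class Copy(NamedTuple):
    """One `svn cp` event (hypothetical record, not a real SVN API type)."""
    revision: int
    src: str
    dst: str


def is_within(path: str, root: str) -> bool:
    """True if `path` is `root` itself or lives underneath it."""
    return path == root or path.startswith(root.rstrip("/") + "/")


def detect_branches(trunks: Iterable[str], copies: Iterable[Copy]) -> list[str]:
    """Return branch roots in discovery order, starting from known trunks.

    A copy creates a new branch when its source is exactly a known branch
    root and its destination is not inside any known branch.  Because new
    branches are added to the known set as we go, the rule is transitive.
    """
    branches = list(trunks)
    for cp in sorted(copies, key=lambda c: c.revision):
        from_known_root = cp.src in branches
        inside_existing = any(is_within(cp.dst, b) for b in branches)
        if from_known_root and not inside_existing:
            branches.append(cp.dst)
    return branches


# Assumed history, loosely based on the layout described earlier.
history = [
    Copy(10, "/trunk/projectA", "/branches/codenameB"),
    Copy(20, "/branches/codenameB",
         "/branches/developers/joe/frobnicator-experiment"),
    # Copy of a subdirectory, not of a branch root: ignored.
    Copy(30, "/trunk/projectA/docs", "/branches/docs-only"),
    # Destination is inside an existing branch: ignored.
    Copy(40, "/trunk/projectA", "/branches/codenameB/nested"),
]

if __name__ == "__main__":
    for root in detect_branches(["/trunk/projectA"], history):
        print(root)
```

With that history it reports /trunk/projectA, /branches/codenameB and
the frobnicator-experiment branch, the last one only because
/branches/codenameB was already known by then.  Processing copies in
revision order matters for exactly that reason.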