Re: GSOC remote-svn: branch detection

Dmitry Ivankov <divanorama@xxxxxxxxx> · Sat, 4 Aug 2012 12:40:18 +0600

Hi,

On Sat, Aug 4, 2012 at 12:17 AM, Jonathan Nieder <jrnieder@xxxxxxxxx> wrote:
> Hi,
>
> Florian Achleitner wrote:
>
>> Two approaches:
>> 1. Import linearly and split later:
>> One idea is to import from svn linearly, i.e. one revision on top of its
>> predecessor, like now, and detect and split branches afterwards. The svn
>> metadata is stored in git notes, so the required information would be
>> available.
>> + allows recovery, because the linear history is always here.
This is a good one, but I'd put the questions another way:
- do we want to query the svn server only for newer revisions even when
our settings change (the branch layout, for example), or do we not mind
some extra queries after a settings change (like git-svn.perl)?
- do we want to be able to filter svn history early (e.g. take trunk,
branches and tags but skip tests_data because it's huge, even though
there are sometimes svn cp to/from it; or the repo may have weird
permissions or even be corrupted)?
- do we just want completely separate, fast, local storage, like an svn
dump file, to use for imports and settings changes?

I personally still haven't decided on those. My set of pros/cons:
+ should be the simplest thing for simple small repos
+ keeps all the original data details and looks quite robust
- becomes complicated if we don't want to, or can't, import some parts of
the history, though git-svn.perl somehow handles this.
- looks like a way to store and access svn dump information: do we
really want it in the form of git objects (almost surely yes), and how
stable, flexible and independent from the svn helper should it be (that's
what Jonathan is talking about)?
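
To make the partial-import concern concrete, here is a minimal Python
sketch of early path filtering. Everything in it is an assumption for
illustration: the node format (a dict with "path" and an optional
"copy_from"), the KEEP list and the function names are hypothetical and
are not taken from remote-svn or git-svn.perl. It only shows where the
complication appears: a kept path copied from a skipped one.

```python
# Hypothetical sketch: the names and the node format are illustrative only.
KEEP = ("trunk", "branches", "tags")  # assumed branch layout settings


def is_kept(path, keep=KEEP):
    """True if path is one of the kept roots or lies below one."""
    path = path.strip("/")
    return any(path == root or path.startswith(root + "/") for root in keep)


def filter_nodes(nodes, keep=KEEP):
    """Split one revision's nodes into (imported, dangling_copies).

    Each node is a dict with "path" and optionally "copy_from" (source path).
    A kept node copied from a skipped path is still imported, but it is also
    reported, because its content cannot be rebuilt from kept history alone.
    """
    imported, dangling = [], []
    for node in nodes:
        if not is_kept(node["path"], keep):
            continue  # filtered out early, never imported
        imported.append(node)
        src = node.get("copy_from")
        if src is not None and not is_kept(src, keep):
            dangling.append(node)
    return imported, dangling
```

The nodes in the dangling list are the ones that would need an extra
query to the server or some separate local storage.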

Weird idea: what if we keep everything in one huge git tree laid out like
rXX/{data,props,copy-from,..}/path/path/path/file? It would represent
all the svn info known so far. OK, I know it's a late stage now and
this idea is completely raw; I'm just posting it to have it written down
somewhere :)
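
A rough Python sketch of how one svn node could map onto that layout,
just to pin the idea down. The node format, the function name and the
way props and copy-from are serialized are all my assumptions here;
nothing like this exists in the helper.

```python
# Hypothetical sketch of the rXX/{data,props,copy-from}/path layout.
# The node dict format and the serialization below are assumptions.


def tree_paths(rev, node):
    """Return {tree_path: blob_content} for one svn node in revision rev."""
    prefix = "r%d" % rev
    path = node["path"].strip("/")
    entries = {}
    if "data" in node:
        # File content goes under rXX/data/<svn path>.
        entries["%s/data/%s" % (prefix, path)] = node["data"]
    if node.get("props"):
        # One "name=value" line per property, sorted for stable output.
        lines = ["%s=%s" % (k, v) for k, v in sorted(node["props"].items())]
        entries["%s/props/%s" % (prefix, path)] = "\n".join(lines) + "\n"
    if "copy_from" in node:
        # Record the copy source as "rNN/<source path>".
        src_rev, src_path = node["copy_from"]
        entries["%s/copy-from/%s" % (prefix, path)] = (
            "r%d/%s\n" % (src_rev, src_path.strip("/"))
        )
    return entries
```

Each key would become a blob path in the big tree, so a branch splitter
could read copy-from entries without looking at the svn dump again.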

>> + it's easier to peek around in the git history than in the svn dump during
>> import to do the branch detection.
>> - requires creation of new commits in the branch detection stage.
>> - this results in double commits and awkward history, linear vs. branched.
>
> I don't think you've captured the real pros and cons here.
>
> + Divides responsibility between a component that fetches and a component
> that splits branches, making for easier debugging, independent refactoring
> of components, reuse in other contexts (e.g., splitting out branches in
> other similar VCSen, etc)
>
> - Divides responsibility between a component that fetches and a component
> that splits branches, which is tricky because it involves designing an
> interface between them and documenting it.  And maybe a different
> interface would be better.
>
> There are also performance and history-clarity ramifications as you've
> mentioned, but they do not seem as important.
>
> Hope that helps,
> Jonathan
>
>> 2. Split during import: