Re: GSOC remote-svn: branch detection

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Saturday 04 August 2012 23:53:58 Ramkumar Ramachandra wrote:
> Hi,
> 
> Florian Achleitner wrote:
> > 1. Import linearly and split later:
> I think this approach will be a lot less messy if you can cleanly
> separate the fetching component from the mapper.  Currently, svndump
> re-creates the layout of the SVN repository.  And the series you
> posted last week contains a patch that attaches a note with SVN
> metadata to each commit.  Do you have thoughts on how the mapping will
> take place?

The mapping itself is currently a black box for me, it's internals could be 
rather complex. It could get a function like is_branch_start, that is called 
with a node ctx and tells if this is likely to be the start of branch. The 
detected branches are stored and upcoming changes in the associated 
directories are mapped to a commit on a branch.
The detection of branch starts and the list of existing branches can be taken 
from whatever logic we want. So that's approx. the idea.

Currently I'm working on more basic preparations. I want to split the creation 
of commits and the creation of blobs in svndump.c.
This is necessary because fast import requires a branch name as an argument to 
the 'commit' command, and
currently a 'commit' command is started when a new revision is encountered in 
the svndump.
But to decide on which branch the commit should go, or even if it will be more 
than one commit, it is necessary to read all the nodes first.
To prevent buffering the node content, I want to replace the inline data format 
(currently used) by 'blob' commands.
While parsing the dump, every node change creates a blob command to feed the 
data immediately into fast-import while the node metadata (struct node_ctx) is 
stored at least until the revision ends. Then the blobs can be put on a linear 
master tree and other branch trees. The node metadata could also be read from 
notes, if remapping branches.
That's not so easy to do, because the current implementation mixes tree-
operations and blob-operations heavily, and relies on only one global 
node_ctx.

> 
> Ram

Flo
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]