Re: challenges using fast-import and svn

"Shawn O. Pearce" <spearce@xxxxxxxxxxx> · Mon, 2 Jul 2007 18:24:00 -0400

David Frech <nimblemachines@xxxxxxxxx> wrote:
> So I wrote, in Lua, a parser for the (terrible) svn dump file format
> that feeds commands into fast-import. The parser took a day and a half
> to write; the fast-import backend took about an hour. ;-)

Heh.  That's about what most folks say.  ;-)

> However, there are issues. I don't currently track branch copies
> correctly, so branches start out with no history, rather than with
> the history of the branch they are copied from; and handling deletes
> is tricky.

Branches are easy to create from the right branch in fast-import,
but it's hard with the SVN dump file to know where it starts from.

One trick folks have used in the past is to assign a mark in
fast-import for each SVN revision.  Marks are very cheap and make
it easy to reference a commit in a from command when you need to
make a new branch.  You can just use the SVN revision number you
get from the SVN dump file.
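
A rough sketch of that trick in Python, in case it helps.  The
function name and its arguments are made up for illustration, and it
assumes at most one commit per SVN revision.  It also assumes marks
must be 1 or greater, so SVN's empty revision 0 cannot be a mark.

```python
def commit_for_svn_rev(ref, svn_rev, message, committer, when,
                       copied_from_rev=None):
    """Return fast-import text that opens a commit marked with its SVN revision.

    Hypothetical helper: 'committer' is "Name <email>" and 'when' is a
    raw date such as "1183415040 -0400".
    """
    if svn_rev < 1:
        # Assumption: fast-import only accepts marks >= 1.
        raise ValueError("SVN revision cannot be used as a mark: %r" % svn_rev)
    # The data command takes an exact byte count, not a character count.
    size = len(message.encode("utf-8"))
    out = (
        f"commit {ref}\n"
        f"mark :{svn_rev}\n"
        f"committer {committer} {when}\n"
        f"data {size}\n"
        f"{message}\n"
    )
    if copied_from_rev is not None:
        # First commit on a copied branch: start from the mark of the
        # SVN revision the copy was made from.
        out += f"from :{copied_from_rev}\n"
    return out
```

The M and D commands for the revision follow this text in the stream.
The from line is only needed on the first commit of a new branch.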

> Here is the problem: if a file or directory is deleted in svn, the
> dumpfile shows simply this:
> 
> Node-path: trunk/project/file-or-directory
> Node-action: delete
> 
> In the case of a file, I can simply feed a "D" command to fast-import;
> but if I'm deleting a whole directory, my code knows nothing about
> what files exist in that directory. Is fast-import smart about this?
> Will it barf if given a directory argument rather than a file for "D"
> commands?

I just read the code again.  You can delete an entire subdirectory
just by sending a D command for that subdirectory, assuming you
don't end the name with a '/'.  So you should be able to just do:

  D file-or-directory

and whatever file-or-directory is, it goes away.  If you were to
send a trailing '/':

  D file-or-directory/

it's likely bad things will happen because fast-import will try to
remove the file or directory named "" (yes, empty string) in the
subdirectory called "file-or-directory" but leave the subdirectory.
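
On the frontend side, a minimal sketch of turning a delete node into
a D command while guarding against the trailing '/' might look like
this in Python.  The helper name is hypothetical, and it assumes you
already know which path prefix maps to the branch being committed.
It also assumes paths need no quoting.

```python
def delete_command(node_path, branch_prefix):
    """Return a fast-import D command for an SVN 'Node-action: delete'.

    Hypothetical helper: branch_prefix is the SVN path that maps to the
    root of the Git branch, e.g. "trunk/project".
    """
    prefix = branch_prefix.rstrip("/") + "/"
    if not node_path.startswith(prefix):
        raise ValueError("%r is not under %r" % (node_path, branch_prefix))
    # Strip the branch prefix, then any trailing '/', so the path never
    # ends with a slash.
    rel = node_path[len(prefix):].rstrip("/")
    if not rel:
        raise ValueError("refusing to emit a D command with an empty path")
    return f"D {rel}\n"
```

The same command is emitted whether the node was a file or a
directory, so the frontend never needs to know which one it was.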

Another option is you can replace a tree with a file at any point in
time, without first deleting it.  So you could also just overwrite
the entire subdirectory with an empty file via the M command, then
delete the file.  But that shouldn't be necessary as the D command
should already do exactly what you want it to do.  Internally
the "replace entire directory with single file" is the same
implementation as the "delete entire directory" implementation...

So I guess this means a documentation update for the D command
would be a good idea?

> I could cache the directory contents in my code, but isn't that partly
> what fast-import is good for?

Yes.  fast-import is really quite good at helping you do the Git side
of the equation.  ;-)

-- 
Shawn.