Re: [PATCH] contrib/svn-fe: Fast script to remap svn history

Hi David,

David Barr wrote:

> This python script walks the commit sequence imported by svn-fe.
> For each commit, it tries to identify the branch that was changed.
> Commits are rewritten to be rooted according to the standard layout.

I like the idea and especially that the heuristics are simple.

Maybe this could be made git-agnostic using the new ls-tree command
you are introducing in fast-import?  Though it would need to get a
revision list from somewhere.  Alternatively, do you think it would
make sense for something like this to be implemented as a filter or
observer of the fast-import stream as it is generated during an
import?

> A basic heuristic of matching trees is used to find parents for the
> first commit in a branch and for tags.

More precisely, the rule used is:

> +    # Find a common path prefix in the changes for the revision
> +    subroot = ""
> +    changes = Popen(["git","diff","--name-only",parent,git_commit], stdout=PIPE)
> +    for path in changes.stdout:
> +        match = subroot_re.match(path)
> +        if match:
> +            subroot = match.group()
> +            changes.terminate()
> +            break

The first change lying in one of

	trunk
	branches/*
	tags/*

determines the branch.  When a branch is renamed, the diff touches
both the old and the new path, so this has roughly a 50/50 chance of
choosing the right branch.
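
To make the rename case concrete, here is a minimal sketch of the
first-match rule.  The pattern is my assumption (the patch's
subroot_re is not quoted above), and find_subroot is a hypothetical
helper name, not something from the script:

```python
import re

# Assumed pattern for the standard layout; the real subroot_re from the
# patch is not quoted in this message.
SUBROOT_RE = re.compile(r"(trunk|branches/[^/]+|tags/[^/]+)(?=/|$)")

def find_subroot(changed_paths):
    """Return the standard-layout prefix of the first matching path, or ""."""
    for path in changed_paths:
        match = SUBROOT_RE.match(path)
        if match:
            return match.group()
    return ""

# A rename of branches/old to branches/new touches both prefixes, so the
# answer depends only on which path happens to be listed first.
print(find_subroot(["branches/new/Makefile", "branches/old/Makefile"]))
print(find_subroot(["branches/old/Makefile", "branches/new/Makefile"]))
```

Whichever side of the rename comes first in the diff output wins,
which is why the choice is no better than a coin toss.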

> +        # Choose a parent for the rewritten commit
> +        if ref in ref_commit:
> +            parent = ref_commit[ref]
> +        elif subtree in tree_commit:
> +            parent = tree_commit[subtree]
> +        else:
> +            parent = ""

If this is a live branch, the parent is the last commit from that
branch.  Otherwise, we take the last commit whose resulting tree
looked like this one.  Or...

> +            # Default to trunk if the branch is new
> +            if parent == "" and "refs/heads/trunk" in ref_commit:
> +                parent = ref_commit["refs/heads/trunk"]

... if all else fails, we take the tip commit on the trunk.
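
Put together, the three fallbacks might look like the sketch below.
This is a simplification under my own assumptions: choose_parent is a
hypothetical name, the commit ids are placeholders, and the patch
applies the trunk default inside a nested block whose condition is not
quoted here.

```python
def choose_parent(ref, subtree, ref_commit, tree_commit):
    """Pick a parent for a rewritten commit using the three fallbacks.

    ref_commit maps ref names to their latest rewritten commit;
    tree_commit maps tree ids to the last commit that produced that tree.
    """
    if ref in ref_commit:
        # Live branch: continue from its current tip.
        return ref_commit[ref]
    if subtree in tree_commit:
        # New ref: reuse the last commit whose tree looked like this one.
        return tree_commit[subtree]
    # Nothing matched: fall back to the trunk tip, or "" for no parent.
    return ref_commit.get("refs/heads/trunk", "")

# Placeholder ids, for illustration only.
refs = {"refs/heads/trunk": "c2", "refs/heads/foo": "c5"}
trees = {"t1": "c1"}
print(choose_parent("refs/heads/foo", "t9", refs, trees))  # existing branch
print(choose_parent("refs/tags/v1", "t1", refs, trees))    # matching tree
print(choose_parent("refs/heads/bar", "t9", refs, trees))  # trunk fallback
```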

For comparison, here's the git-svn rule:

> 	# look for a parent from another branch:
> 	my @b_path_components = split m#/#, $self->{path};

Among the paths above this commit's base directory [if this is
branches/foo, examine first branches/foo, then branches, then /]:

> 	while (@b_path_components) {
> 		$i = $paths->{'/'.join('/', @b_path_components)};
> 		last if $i && defined $i->{copyfrom_path};
> 		unshift(@a_path_components, pop(@b_path_components));
> 	}
> 	return undef unless defined $i && defined $i->{copyfrom_path};

Find the first one with copyfrom information (i.e., one that was
renamed or copied from elsewhere in this revision).

> 	my $branch_from = $i->{copyfrom_path};
> 	if (@a_path_components) {
> 		print STDERR "branch_from: $branch_from => ";
> 		$branch_from .= '/'.join('/', @a_path_components);
> 		print STDERR $branch_from, "\n";
> 	}

Build back up the URL (so if branches was renamed to Branches but
branches/foo had no copyfrom information, we look for Branches/foo).
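
For those who would rather not read Perl, roughly the same walk in
Python.  This is a sketch of the quoted logic, not git-svn's code:
find_branch_source is a hypothetical name, and I am assuming `paths`
is a dict keyed by absolute repository path whose values may carry a
"copyfrom_path" entry, loosely modelled on the $paths hash above.

```python
def find_branch_source(path, paths):
    """Walk up from `path` until an entry with copyfrom information is found.

    Returns the reconstructed source path, or None if no ancestor
    (including `path` itself) was copied in this revision.
    """
    head = [part for part in path.split("/") if part]  # components still to try
    tail = []                                          # components stripped so far
    entry = None
    while head:
        entry = paths.get("/" + "/".join(head))
        if entry and entry.get("copyfrom_path") is not None:
            break
        tail.insert(0, head.pop())
    if not entry or entry.get("copyfrom_path") is None:
        return None
    source = entry["copyfrom_path"]
    if tail:
        # Re-append what was stripped while walking up.
        source += "/" + "/".join(tail)
    return source

# branches was renamed to Branches; branches/foo has no copyfrom of its own.
changed = {"/branches": {"copyfrom_path": "/Branches"}}
print(find_branch_source("branches/foo", changed))
```

With the example input, the lookup misses on /branches/foo, hits on
/branches, and then rebuilds the source path as /Branches/foo.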

[...]
> 	my $gs = $self->other_gs($new_url, $url,
> 		                 $branch_from, $r, $self->{ref_id});
> 	my ($r0, $parent) = $gs->find_rev_before($r, 1);

Find the last revision that changed that path and record it.

Maybe we could benefit from including the copyfrom information in the
fast-import stream output by svn-fe somehow?  The simplest way to do
this would be some specially formatted comments.  An alternative (in
the spirit of Sam's earlier suggestions) might be to represent it in
the tree svn-fe creates, for example by introducing dummy

	foo.copiedfrom

symlinks.

Thanks, that was interesting.
Jonathan