git-svn commit parents detection algorithm

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hello.

I have several questions regarding git-svn parents detection algorithm for the 
newly created branch. It seems that I have found some bugs.

The most general question is: could you describe a parents detection algorithm 
in the git-svn?

1. When I try to fetch revision created like this:
svn cp trunk branches/branch1
svn merge http://host/path/branches/branch2 branches/branch1
svn commit

svn:mergeinfo property is set but git-svn tries to do "svn switch"'s low level 
analog if it's possible. svn:mergeinfo property is not received in this case 
(revealed by setting a breakpoint at change_dir_prop in git-svn script) though 
it's present ("svn propget" shows this property). But if "svn switch" 
operation is not available in the perl svn client library simple "svn update" 
operation is used. I didn't check but suspect that then svn:mergeinfo would be 
processed and git-svn's find_extra_svn_parents would find addidtional parents. 
But I thought that parents set shouldn't depend on the svn version (or at 
least newer versions of svn library that support "svn switch" shouldn't make 
behaviour worse).

My opinion is that svn:mergeinfo should be checked in this case and merge 
parents be detected and added to the whole parents set.

2. Is it true that "find_extra_svn_parents" requires a reference to a list of 
exactly one parent as 4th argument?

If @$parents is empty in the function "git rev-list" and "git merge-base" 
calls in it would fail because of bad syntax. If @$parents has 2 or more 
elements than merge-base call would have unpredictable behaviour.

$merge_base = command_oneline(
				"merge-base",
				@$parents, $merge_tip,
			);
			
Git merge-base command finds a merge-base between the first commit and any of 
other commits listed as arguments. So if @$parents is (a, b) "git merge-base" 
will find common ancestor between "a" and "b" or "a" and $merge_tip that is 
nonsense. 

More over if the "find_extra_svn_parents" call requires exactly one parent and 
"make_log_entry" function in git-svn is always called with one-element parents 
list then finding additional parents in "find_extra_svk_parents" method that  is 
called before "find_extra_svn_parents" would cause fail of finding extra parents 
in  "find_extra_svn_parents".

Maybe these lines should be rewritten as 

$merge_base = command_oneline(
				"merge-base",
				$merge_tip, @$parents
			);
This makes sense in the case of several parents.

3. What's the purpose of "parents_exclude" git-svn's function?

It is called on reference to already found parents list and @commits that are 
found as "merge tips" of svn:mergeinfo property. But note that svn:mergeinfo 
value lines are listed *alphabetically*. For example:
/branches/branch1:4
/branches/branch2:3-4,8
/trunk:2,7
disregarding their reverse chronological order(!). So the @commits list would 
contain hashes that corresponding to (branch1, r4), (branch2, r8), (trunk, r7) 
in this "alphabetical" order (note: alphabetical order depends on branch 
naming).

Let us consider another example: we have commits in @commits: a, b, c, d, e, 
f, g (in this order) and $@parents consists of commit "h" . And suppose for 
simplicity that no commit of a, b, c, d, e, f, g is reachable from "h" (than 
"--not h" will have no sense). Suppose the commits have the following reverse 
chronological order e, c, f, g, a, b, d.

Then do {} while () in the "parents_exclude" will have 3 iterations:

"git rev-list -1 a b c d e f g --not h" for this case will return simply the 
latest 
commit (e). Then inner cycle will set @new to the slice @new = {a, b, c, d} // 
f, g are rejected because they are after "e". We add "e" to @excluded and 
repeat

`git rev-list -1 a b c d --not h` will return c as latest from these. 
New value of @new will become {a, b} and @excluded {e, c}

`git rev-list -1 a b --not h` will return a.
@new will become empty and @excluded will contain {e, c, a}.

Personally I don't know why these commits are better than others.

Moreover the order of {e, c, a} is not used later in the 
"find_extra_svn_parents" function.

I have two ideas about parents_exclude's purpose:
1) it lists a set of all commits not reachable from parents;
2) it lists a set of all commits not reachable from parents in *chronological* 
order.

To achieve 1) the "last;" operator should be removed from "parents_exclude" 
function.
To achieve 2) the "last;" operator should be removed from "parents_exclude" 
function and the function result should be assigned to @merge_tips list in 
"find_extra_svn_parents" function (without  "map { $_ => 1 }" construction) to 
save @excluded commits reverse chronological *order*.

As for me 2) is the most reasonable because then fetched svn repository 
wouldn't depend on branching naming.

If parents in @excluded shouldn't be reachable from another parents (that is 
true) then

my @cmd = ('rev-list', "-1", @commits, "--not", @$parents);

should be replaced into

my @cmd = ('rev-list', "-1", @commits, "--not", @$parents, @excluded );

Maybe this will make useless the check done in the doulbe cycle at the end of 
"find_extra_svn_parents":

	for ( my $i = 0; $i <= $#new_parents; $i++ ) {
			for ( my $j = 0; $j <= $#new_parents; $j++ ) {

4. The "parents_exclude" function pushes "undef" value to @merge_tips (in the 
case of bad $tip_commit) list and then excludes them by "grep" and "next 
unless $merge_tip and $excluded{$merge_tip};" line. So pushing "undef" is 
useless operation, it shouldn't perplex.

5. In the function "do_fetch" the last commit (my $lc = $self->last_commit) is 
*not* put to the parents list for the "find_extra_svn_parents" function in the 
case when "copyfrom_path parent" from "find_parent_branch" is found (it is 
added later then) but is put otherwise. Is it done intentionally or by a 
mistake?

6. Could you please tell in a few word "linearize_history" function's goals? 
In all cases I tested the commits list returned by it was the same as its 
input hash list. I tried to imagine the situation when these lists differ.

Let 
commit c1 has parent c2
commit c2 has parents c3 and c4
commit c3 has parent c4
commit c4 has parent c5

In this case "linearize_history" for (c1, c2, c3, c4, c5) list would return 
(c1, c2, c3) but I can't reproduce this situation by "git" command (I will try 
using git low-level library later).
Additional question: is it true that "git svn dcommit" will loose commits c4 
and c5 in this strange case?
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]