Re: impure renames / history tracking

Paul Jakma <paul@xxxxxxxx> · Wed, 1 Mar 2006 21:25:08 +0000 (GMT)

Hi Junio,

On Wed, 1 Mar 2006, Junio C Hamano wrote:

Interestingly enough, there are two levels of "rename tracking" the 
current git does.  Whey you run "git whatchanged -M", you are 
looking at renames between each commit in the commit chain, one 
step at a time.  There as long as the rename+rewrite does not 
amount to too much rewrite, you would see what should be detected 
as rename to be detected as renames.

Right.

I found the current default threshold parameters to be about right, 
maybe a bit too tight sometimes, though.  If you want to loosen the 
default, you can specify similiarity index after -M.

That's one option.

I'm wondering though if we couldn't also allow for users to 
additionally encode naming 'hints', to aid this 'similarity' 
detection process.

The way recursive merge strategy uses the rename detection, unlike 
what whatchanged shows you, does not use chains of commits down to 
the common merge base in order to detect renames (my recollection 
may be wrong here -- it's a while since I looked at the recursive 
merge the last time).  It just looks at the two heads being merged, 
and detects similarility between them.  So it does not make _any_ 
difference with the current implementation of recursive merge if 
you kept a history full of "honest but disgusting" commits or 
collapsed them into a history with small number of "cleaned up" 
commits.

I'm going to have to stare at this paragraph a lot longer and harder 
to understand it :).

One thing it _could_ do (and you _could_ implement as another merge 
strategy and call it "pauls-rename" merge) is to follow the commit 
chain one by one down to the common merge base from both heads 
being merged, and analyze rename history on the both commit chains.

Right, I was just thinking that while making tea actually. This could 
be part of the 'collapsing' process. (or call it "coalesce 
too-detailed commits" process if that is less offensive to ones sense 
of process ;) ).

Actually, you're sort of suggesting following the chains in parallel, 
right? Ie in wall-clock time order, rather than chain order. And 
doing name resolution across the 'to-be-merged' chains at each step 
of the way? Sort of a lesser subset of how other SCMs maintain state 
for names globally?

It's not so much /resolving/ names I'm worried about in the first 
place. It's there simply being no information in the first place to 
indicate (from one single-parent commit to the next) which names were 
renamed.

Then, you would get better rename+rewrite detection than what it 
currently does.

But if I follow the commit chain in order to try extract

HOWEVER.

If you have that kind of rename-following merge, a workflow that 
collapses a useful history into a single huge commit "Ok, this 
commit is a roll-up patch between version 2.6.14 and 2.6.15" 
becomes far less attractive than it currently already is.  At that 
point, you _are_ throwing away useful history.

Yes, I agree. And I am, as part of arguing git's case (several SCMs 
are being evaluated and considered, I'm the git proponent at the 
moment), I'm going to suggest workflow ought to be re-evaluated to 
ensure it is generally reasonable, rather than be kept for the sake 
of it keeping (particularly as it may be tailored to the 
needs/limitations of $TRADITIONAL_SCM).

However, I suspect at least some level of collapsing will be desired 
(just as it is with Linux and git).

The workflow issue is seperate from the 'impure rename' issue though, 
even if the workflow I gave as an example excerbates the issue, 
"rename and rewrite half of it" and hard-to-detect renames can still 
occur in the detailed git/linux workflows, surely?

regards,
--
Paul Jakma	paul@xxxxxxxx	paul@xxxxxxxxx	Key ID: 64A2FF6A
Fortune:
If you really knew C++, you wouldn't even joke about putting it
in the kernel.

	- Richard Johnson on linux-kernel
-
: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html