Hi Junio,
On Wed, 1 Mar 2006, Junio C Hamano wrote:
Interestingly enough, there are two levels of "rename tracking" the
current git does. When you run "git whatchanged -M", you are
looking at renames between each commit in the commit chain, one
step at a time. There, as long as the rename+rewrite does not
amount to too much rewrite, you would see what should be detected
as a rename actually detected as a rename.
Right.
I found the current default threshold parameters to be about right,
maybe a bit too tight sometimes, though. If you want to loosen the
default, you can specify a similarity index after -M.
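
To make the threshold idea concrete, here is a minimal sketch of
similarity-based rename detection. It is not git's actual algorithm:
it assumes trees are plain dicts of path -> content, uses Python's
difflib for scoring, and the names similarity(), detect_renames() and
threshold are hypothetical, with threshold standing in for the number
given after -M.

```python
from difflib import SequenceMatcher


def similarity(old: str, new: str) -> float:
    """Return a 0..1 line-based similarity score between two file contents."""
    return SequenceMatcher(None, old.splitlines(), new.splitlines()).ratio()


def detect_renames(before: dict, after: dict, threshold: float = 0.5) -> dict:
    """Pair each deleted path with its most similar added path.

    A pair only counts as a rename if its score reaches the threshold
    (an assumption standing in for the similarity index after -M).
    """
    deleted = [p for p in before if p not in after]
    added = [p for p in after if p not in before]
    renames = {}
    for old in deleted:
        best_path, best_score = None, threshold
        for new in added:
            score = similarity(before[old], after[new])
            # Skip targets already claimed by an earlier deleted path.
            if score >= best_score and new not in renames.values():
                best_path, best_score = new, score
        if best_path is not None:
            renames[old] = best_path
    return renames


before = {"a.c": "l1\nl2\nl3\nl4\n"}
after = {"b.c": "l1\nl2\nl3\nchanged\n"}   # renamed, one line of four rewritten
print(detect_renames(before, after, threshold=0.5))   # {'a.c': 'b.c'}
print(detect_renames(before, after, threshold=0.9))   # {}
```

Loosening the threshold makes a rename+rewrite show up as a rename;
tightening it turns the same change into an unrelated delete and add.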
That's one option.
I'm wondering though if we couldn't also allow for users to
additionally encode naming 'hints', to aid this 'similarity'
detection process.
The way recursive merge strategy uses the rename detection, unlike
what whatchanged shows you, does not use chains of commits down to
the common merge base in order to detect renames (my recollection
may be wrong here -- it's a while since I looked at the recursive
merge the last time). It just looks at the two heads being merged,
and detects similarity between them. So it does not make _any_
difference with the current implementation of recursive merge if
you kept a history full of "honest but disgusting" commits or
collapsed them into a history with small number of "cleaned up"
commits.
I'm going to have to stare at this paragraph a lot longer and harder
to understand it :).
One thing it _could_ do (and you _could_ implement as another merge
strategy and call it "pauls-rename" merge) is to follow the commit
chain one by one down to the common merge base from both heads
being merged, and analyze rename history on both commit chains.
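
A rough sketch of the difference between comparing only the two
endpoints and following the chain one step at a time. This is an
illustration under stated assumptions, not how any git merge strategy
is implemented: snapshots are plain dicts of path -> content, scoring
uses Python's difflib, and similar(), renames_between() and
renames_along_chain() are hypothetical names.

```python
from difflib import SequenceMatcher


def similar(a: str, b: str) -> float:
    """Line-based similarity score in the range 0..1."""
    return SequenceMatcher(None, a.splitlines(), b.splitlines()).ratio()


def renames_between(before: dict, after: dict, threshold: float = 0.5) -> dict:
    """Map old path -> new path for files that vanished and reappeared
    under another name with sufficiently similar content."""
    gone = [p for p in before if p not in after]
    new = [p for p in after if p not in before]
    found = {}
    for old in gone:
        candidates = [(similar(before[old], after[n]), n)
                      for n in new if n not in found.values()]
        if candidates:
            score, path = max(candidates)
            if score >= threshold:
                found[old] = path
    return found


def renames_along_chain(snapshots: list, threshold: float = 0.5) -> dict:
    """Compose per-step renames over a list of snapshots, oldest first."""
    current = {p: p for p in snapshots[0]}   # original path -> current path
    for before, after in zip(snapshots, snapshots[1:]):
        step = renames_between(before, after, threshold)
        current = {orig: step.get(now, now) for orig, now in current.items()}
    # Keep only paths that moved and still exist in the last snapshot.
    return {orig: now for orig, now in current.items()
            if orig != now and now in snapshots[-1]}


base = {"old.c": "a\nb\nc\nd\n"}
step1 = {"new.c": "a\nb\nc\nd\n"}   # pure rename
head = {"new.c": "w\nx\ny\nd\n"}    # later rewrite of most of the file

print(renames_between(base, head))               # {} (endpoints too dissimilar)
print(renames_along_chain([base, step1, head]))  # {'old.c': 'new.c'}
```

The endpoint comparison misses the rename because only one line in
four survives, while the step-by-step walk catches it at the commit
where the rename was still "pure".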
Right, I was just thinking that while making tea actually. This could
be part of the 'collapsing' process. (or call it "coalesce
too-detailed commits" process if that is less offensive to one's sense
of process ;) ).
Actually, you're sort of suggesting following the chains in parallel,
right? I.e., in wall-clock time order, rather than chain order. And
doing name resolution across the 'to-be-merged' chains at each step
of the way? Sort of a lesser subset of how other SCMs maintain state
for names globally?
It's not so much /resolving/ names I'm worried about. It's that
there is simply no information in the first place to indicate (from
one single-parent commit to the next) which names were renamed.
Then, you would get better rename+rewrite detection than what it
currently does.
But if I follow the commit chain in order to try extract
HOWEVER.
If you have that kind of rename-following merge, a workflow that
collapses a useful history into a single huge commit "Ok, this
commit is a roll-up patch between version 2.6.14 and 2.6.15"
becomes far less attractive than it currently already is. At that
point, you _are_ throwing away useful history.
Yes, I agree. And, as part of arguing git's case (several SCMs are
being evaluated and considered; I'm the git proponent at the
moment), I'm going to suggest the workflow ought to be re-evaluated
to ensure it is generally reasonable, rather than be kept for the
sake of keeping it (particularly as it may be tailored to the
needs/limitations of $TRADITIONAL_SCM).
However, I suspect at least some level of collapsing will be desired
(just as it is with Linux and git).
The workflow issue is separate from the 'impure rename' issue though.
Even if the workflow I gave as an example exacerbates the issue,
"rename and rewrite half of it" and hard-to-detect renames can still
occur in the detailed git/linux workflows, surely?
regards,
--
Paul Jakma paul@xxxxxxxx paul@xxxxxxxxx Key ID: 64A2FF6A
Fortune:
If you really knew C++, you wouldn't even joke about putting it
in the kernel.
- Richard Johnson on linux-kernel