Hi Linus,
On Wed, 1 Mar 2006, Linus Torvalds wrote:
The thing is, it does better than anything that _tries_ to be
"reliable".
I can pretty much _guarantee_ that you can't do it better.
I'm willing to take that argument to the 'project' concerned, I just
need to be pretty sure of it.
Tracking "inodes" - aka file identities - (which is what BK does,
and I assume what SVN does) is fundamentally problematic. In
particular, it's a horrible problem when two inodes "meet" under
the same name. You now have two identities for the same file, and
you're fundamentally screwed.
Yes, in that model it is. This, interestingly, is not the BK model, I
suspect (see below).
It doesn't even need renames to be a problem. JUST THE FACT THAT
YOU TRY TO TRACK FILE "IDENTITY" HISTORY IS BROKEN.
If it's "file identity" globally across the lifetime of the project,
I agree 100%. The 'traditional' SCM concerned does this.
That's not a solution I'd want to explore either; I'm only
interested in the identity of files within any /one/ commit. In
saying that, I recognise it's pointless to try to annotate file-change
information in multi-parent commits (merges).
For example, take CVS, which doesn't actually try to do renames,
but _does_ try to track the identity of a file, since all the
history is tied into that identity: think about what happens in
Attic when a file is deleted. Completely broken model.
ACK, {Attic,deleted_files}/ is just horrid.
And that's really fundamental. CVS doesn't show the problems so
much, because CVS actively tries to make it hard to do these
things.
ACK.
With renames-tracking-file-identities, it's _really_ easy to get
some major confusion going. What happens when one branch creates a
file, and another one renames a file to that same name, and they
merge?
Well, the conflict has to be resolved somehow, even today.
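To make the create-versus-rename clash concrete, here is a minimal sketch
in Python. It assumes a tree is just a mapping from path to content, with
no file identities at all; find_name_clashes and the example trees are
hypothetical names for illustration, not anything git or BK provides.

```python
def find_name_clashes(base, ours, theirs):
    """Return paths that both branches introduced with differing content.

    Each tree is a plain dict of path -> content (an assumption of this
    sketch). A path that already existed in the common base is an
    ordinary content merge and is not reported here.
    """
    clashes = []
    for path in sorted(set(ours) & set(theirs)):
        if path in base:
            continue  # present before the branches diverged
        if ours[path] != theirs[path]:
            clashes.append(path)
    return clashes


base = {"a.c": "int a;\n"}
# One branch creates b.c from scratch.
ours = {"a.c": "int a;\n", "b.c": "int brand_new;\n"}
# The other branch renames a.c to b.c.
theirs = {"b.c": "int a;\n"}

print(find_name_clashes(base, ours, theirs))
```

Either way the clash at b.c has to be resolved by someone; the sketch only
shows that it can be detected from names and contents alone.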
Don't tell me it doesn't happen. It happened under BK. The way BK
"solved" it was to keep the two separate identities: one of them
got resolved to the new filename, the other one went into the
"deleted" directory.
Right. That's what the 'traditional workflow' SCM I'm thinking of
does - not BK funnily enough, but an SCM predating BK which also
happens to use SCCS files, and with some of the same high-level
push/pull constructs as BK (interestingly).
It also tracks name history globally using a deleted_files/ history,
which is maintained, but I don't think it does this for name merges
like the above.
In the one I'm thinking of, it does (I /think/, I'm not an expert in
it) the following:
Given two files, say:

  old:
    1.1---1.2---1.3

  new:
    1.1

 - constructs a 'fake' base SCCS revision, empty
 - adds the top 'old' version as a branch
 - adds the top 'new' version as a new delta

          1.1.1.1
         /
  1.1---------1.2

Where in the merged file:

  1.1:     empty
  1.1.1.1: was 1.3 from 'old'
  1.2:     is 1.1 from 'new'
However, it does /not/ create a deleted_files entry for the 'old'
file. (AFAICT - I may not have a sufficiently full understanding of
this SCM)
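The construction above can be sketched roughly as follows. This is only a
toy model of the revision graph as described in this mail, under the same
caveat about my understanding; Delta and merge_histories are hypothetical
names, and nothing here reads or writes real SCCS files.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Delta:
    rev: str               # SCCS-style revision id, e.g. "1.1.1.1"
    parent: Optional[str]  # revision this delta is based on
    content: str


def merge_histories(old_tip: str, new_tip: str) -> List[Delta]:
    """Build the merged file's deltas from the two tip contents.

    Only the tip of each input history is carried over, as in the
    description above; earlier 'old' revisions are not modelled.
    """
    return [
        Delta("1.1", None, ""),            # fake, empty base revision
        Delta("1.1.1.1", "1.1", old_tip),  # top of 'old' as a branch
        Delta("1.2", "1.1", new_tip),      # top of 'new' on the trunk
    ]


merged = merge_histories("contents of old 1.3\n", "contents of new 1.1\n")
for d in merged:
    print(d.rev, "<-", d.parent)
```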
Guess what happens when the side that got merged into "deleted"
continues to edit the file? That's right - their edits happen on
the deleted file, and never show up in the real tree in a
subsequent merge ever again.
Indeed - horrid.
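A toy model of that failure, assuming (purely for illustration) that a
tree is keyed by file identity and that a merge parks the losing identity
under a deleted/ path. The ids, paths and helper names are all made up;
this is not how BK actually stores anything.

```python
def apply_edit(tree, file_id, new_content):
    """Apply an edit to whatever path the identity currently maps to."""
    path, _ = tree[file_id]
    tree[file_id] = (path, new_content)


def visible(tree):
    """The tree a user checks out: everything not parked under deleted/."""
    return {
        path: content
        for path, content in tree.values()
        if not path.startswith("deleted/")
    }


# State after the merge: two identities wanted the name foo.c, one lost.
merged = {
    "id-A": ("foo.c", "A's text\n"),
    "id-B": ("deleted/foo.c", "B's text\n"),
}

# The side that owns id-B keeps editing "its" foo.c, and merges again.
apply_edit(merged, "id-B", "B's later fix\n")

print(visible(merged))
```

The later fix is recorded against id-B, so it lands under deleted/ and the
visible foo.c never changes.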
And as far as I can tell, BK really did the best you can do.
Following file identities really _is_ fundamentally broken. It
sounds like a nice idea, but while you might solve a few problems,
you create a whole raft of much more fundamental problems.
For tracking identity across more than one commit - I fully agree.
That's not quite what I'm thinking of though. Is it worth going on
with the discussion on a:
'track identities *only* from context of /the/ parent to
this commit'
So next time you think about a merge that might have been improved
by tracking renames, please also think about a merge where one of
the filenames came from two or more different sources through an
earlier merge, and thank your benevolent Gods that they instructed
me to make git be based purely on file contents.
Oh, I agree muchely here.
I wouldn't change git. I only wonder whether its rename heuristics
could be given an additional advisory-only hint (for single-parent
commits at least - never merges - and only on a per-commit basis).
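What I mean by an advisory-only hint might look something like the sketch
below. It is NOT git's rename detection: the similarity measure (Python's
difflib), the 0.5 threshold, and the detect_renames function with its
hints parameter are all assumptions made up for illustration. The point is
only that a hint is tried first, must still pass the same content check,
and is silently dropped if it does not.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    return SequenceMatcher(None, a, b).ratio()


def detect_renames(parent, child, hints=None, threshold=0.5):
    """Map new path -> old path between one parent tree and its child.

    parent and child are dicts of path -> content. hints is an optional
    dict of old path -> new path recorded with this one commit; it is
    advisory and never overrides the content comparison.
    """
    hints = hints or {}
    deleted = {p: c for p, c in parent.items() if p not in child}
    added = {p: c for p, c in child.items() if p not in parent}
    renames = {}

    # Try hinted pairs first, but only accept them if the content agrees.
    for old, new in hints.items():
        if old in deleted and new in added and new not in renames:
            if similarity(deleted[old], added[new]) >= threshold:
                renames[new] = old
                del deleted[old]

    # Fall back to plain best-match-by-content for everything else.
    for new in sorted(added):
        if new in renames:
            continue
        best, best_score = None, 0.0
        for old in sorted(deleted):
            score = similarity(deleted[old], added[new])
            if score > best_score:
                best, best_score = old, score
        if best is not None and best_score >= threshold:
            renames[new] = best
            del deleted[best]
    return renames


text = "int main(void) { return 0; }\n"
parent = {"a.c": text, "b.c": text}  # two identical candidates
child = {"c.c": text}

print(detect_renames(parent, child))                        # content only
print(detect_renames(parent, child, hints={"b.c": "c.c"}))  # with a hint
```

With two identical candidates the content alone cannot say which file was
renamed; the hint breaks the tie, and a hint that contradicts the content
changes nothing.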
I probably should first explore how git deals with rename clashes..
regards,
--
Paul Jakma paul@xxxxxxxx paul@xxxxxxxxx Key ID: 64A2FF6A
Fortune:
I'm glad I was not born before tea.
-- Sidney Smith (1771-1845)