Re: VCS comparison table

Linus Torvalds <torvalds@xxxxxxxx> · Mon, 23 Oct 2006 10:29:53 -0700 (PDT)

On Sun, 22 Oct 2006, Matthew D. Fuller wrote:
> 
> > This special treatment influences or directly causes many of the
> > things in bzr that we've been discussing:
>   [...]
> > I've been arguing that all of these impacts are dubious. But I can
> > understand that a bzr user hearing arguments against them might fear
> > that they would lose the ability to be able to see a view of commits
> > that "belong" to a particular branch.
> 
> Dead center.

The thing that the bzr people don't seem to realize is that their choice 
of revision naming has serious side effects, some of them really 
technical, and limiting.

I already brought this up once, and I suspect that the bzr people simply 
DID NOT UNDERSTAND the question:

 - how do you do the git equivalent of "gitk --all"?

which is just another reason why "branch-local" revision naming is simply 
stupid and has real _technical_ problems.

I really suspect that a lot of people can't see further than their own 
feet, and don't understand the subtle indirect problems that branch-local 
naming causes. 

For example, how long does it take to do an arbitrary "undo" (i.e. forcing a 
branch to an earlier state) in a project with tens of thousands of 
commits? That's actually a really important operation, and yes, 
performance does matter. It's something that you do a lot when you do 
things like "bisect" (which I used to approximate with BK by hand, and 
yes, re-weaving the branch history was apparently a big part of why it 
took _minutes_ to do sometimes).

Again, this is something that people don't expect to have _anything_ to do 
with revision numbering, but the fact is, it's a big part of the picture. 
If you have branch-local revision numbering, you need to renumber all 
revisions on events like this, and even if it is "just" re-creating the 
revno->"real ID" cache, it's actually an expensive operation exactly 
because it's going to be at least linear in history.
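
To make that cost concrete, here is a toy Python sketch of what re-creating 
such a cache involves. It is NOT bzr's actual code or data model: the 
function name, the "parents" mapping and the commit names are all made up 
for illustration, and it assumes revnos simply count commits along the 
first-parent chain from the root.

```python
# Toy model only: not bzr's real data structures or API.
def rebuild_revno_cache(parents, tip):
    """Map branch-local revnos to global IDs for the branch ending at tip.

    parents maps a commit ID to its list of parent IDs (first parent first).
    """
    chain = []
    rev = tip
    # Walk from the tip back to the root: one step per commit on the branch.
    while rev is not None:
        chain.append(rev)
        ps = parents.get(rev, [])
        rev = ps[0] if ps else None
    chain.reverse()  # oldest first, so revno 1 is the root
    return {revno: rev_id for revno, rev_id in enumerate(chain, start=1)}


# A linear history of n commits: c0 <- c1 <- ... <- c(n-1)
n = 50000
parents = {f"c{i}": ([f"c{i - 1}"] if i > 0 else []) for i in range(n)}
cache = rebuild_revno_cache(parents, f"c{n - 1}")
```

The loop visits every commit on the branch once, so the work grows with 
the length of history no matter how small the "undo" itself was.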

One of the git design requirements was that no operation should _ever_ 
need to be linear in history size, because it becomes a serious limiter of 
scalability at some point. We were seeing some of those issues with BK, 
which is why I cared.

So in git, doing things like jumping back and forth in history is O(1). 
Always (with a really low constant cost too). Of course, checking out the 
end result is then roughly O(n), but even there "n" is the size of the 
_changes_, not number of revisions or number of files.
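
As a rough illustration of why the jump itself is constant-time, here is a 
toy Python model in which a branch is nothing but a name pointing at a 
globally named commit. This is a sketch under that assumption, not git's 
real implementation; the Repo class and its methods are hypothetical, and 
checking out the resulting tree is deliberately left out.

```python
# Toy model only: a branch is a name pointing at one commit ID.
class Repo:
    def __init__(self):
        self.refs = {}     # branch name -> commit ID
        self.parents = {}  # commit ID -> list of parent IDs

    def commit(self, branch, commit_id):
        """Record a new commit on top of the branch and advance the branch."""
        tip = self.refs.get(branch)
        self.parents[commit_id] = [tip] if tip is not None else []
        self.refs[branch] = commit_id

    def reset(self, branch, commit_id):
        """Force the branch to point at an existing commit."""
        if commit_id not in self.parents:
            raise KeyError(commit_id)
        # A single dictionary store: no walk over history, nothing renumbered.
        self.refs[branch] = commit_id


repo = Repo()
for i in range(50000):
    repo.commit("master", f"c{i}")

# Jump far back in history; the cost does not depend on the 50000 commits.
repo.reset("master", "c10")
```

Because commit names are global rather than branch-local, moving the 
branch invalidates nothing: every other name for every other commit stays 
exactly as it was.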

(And there are obviously operations that _are_ O(revision history), the 
most trivial one being anything that visualizes all of history - but they 
depend on the size of history not because the operation itself gets more 
expensive, but because the dataset increases).

The whole confusion between "bzr pull" and "bzr merge" is another 
_technical_ sign of why branch-local revision numbers are a mistake. 

			Linus