Re: Bizarre missing changes (git bug?)

Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> · Sun, 27 Jul 2008 22:30:59 -0700 (PDT)

On Sun, 27 Jul 2008, Linus Torvalds wrote:
> 
> And it's why gitk can start printing out the history _before_ three 
> seconds has passed. And that's really really important.

Btw, the reason it's really really important is that "three seconds" can 
actually easily be "three minutes" - if the project is big, or if you 
simply don't have everything in the cache, so you actually need to do tons 
of IO to generate the whole history.

So every normal operation absolutely _must_ be incremental and not rely on 
any calculation of the whole history in order to then simplify it.
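
To illustrate what "incremental" means here, a toy sketch in Python. The
history table, the commit names and the timestamps are all made up, and this
is not git's actual revision walker: it only shows a walk that can hand out
the first commit before it has looked at any parent.

```python
import heapq
from typing import Dict, Iterator, List, Tuple

# Hypothetical toy history: commit id -> (timestamp, parent ids).
HISTORY: Dict[str, Tuple[int, List[str]]] = {
    "d": (40, ["b", "c"]),
    "c": (30, ["a"]),
    "b": (20, ["a"]),
    "a": (10, []),
}


def walk(tip: str, history: Dict[str, Tuple[int, List[str]]]) -> Iterator[str]:
    """Yield commits newest-first, reading parents only after a commit is shown."""
    seen = {tip}
    # heapq is a min-heap, so store negated timestamps to pop the newest first.
    heap = [(-history[tip][0], tip)]
    while heap:
        _, commit = heapq.heappop(heap)
        yield commit  # the caller can display this right away
        for parent in history[commit][1]:
            if parent not in seen:
                seen.add(parent)
                heapq.heappush(heap, (-history[parent][0], parent))


if __name__ == "__main__":
    for commit in walk("d", HISTORY):
        print(commit)
```

Because walk() is a generator, the caller gets "d" without the rest of the
table having been visited, which is the property a viewer needs.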

Of course, post-processing is fine for some things. For example, in the 
thread I pointed you to originally (see filter-branch + full-history in 
google, or look in some git archive) I suggested a post-processing of the 
merge history for filter-branch. I suspect it's very acceptable for _that_ 
kind of use to "batch" things up and not do them with partial knowledge.

But this incremental thing is why, for example, I suggest people use 
"git gui blame" instead of "git blame" when looking for problems - because 
the latter cannot be done incrementally, and as a result can cause really 
irritating delays (exactly because it basically needs to synchronously 
walk back to the beginning of history).

The kernel repo, btw, is pretty small in this regard. The cases that 
caused much more pain were the insane KDE ones that were something like 
ten times the size. We've optimized things pretty aggressively, but...

Btw, if I sound irritated, it's because we had all these discussions about 
three _years_ ago when git got started. This is not a new issue. It's 
hard.

I've been pushing on people to do things incrementally very hard over the 
last few years because it's such a _huge_ usability issue.

For example, I've pointed you to the incremental nature of "gitk" as an 
example of how things should work, but that's actually fairly recent: it 
wasn't that long ago that "gitk" used to pass in "--topo-order" or 
"--date-order" to the core git revision machinery, and that actually is 
another of those "global" operations that you need the whole history for.

So gitk actually used to pause for three seconds (or ten, or thirty) 
before it would show the results. I'm really happy to report that Paul 
finally did the (trivial) topo-sort in gitk, meaning that he could re-sort 
it as necessary and keep things incremental. It was one of my biggest UI 
gripes for the longest time (and I wasted time adding a special "partial 
output mode" that gitk didn't even then end up using because Paul did 
things the right way).
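
A rough sketch of the "re-sort as necessary" idea, in Python. This is an
assumption about the general approach, not gitk's real code: the class name
IncrementalView and its methods are hypothetical, and it naively re-sorts
everything it has seen on each arrival. The point is only that a viewer can
fix up the order of what it already has when a commit shows up late.

```python
import heapq
from typing import Dict, List


class IncrementalView:
    """Hypothetical viewer model: keeps commits it has received so far and
    orders them so that every known child comes before its parents."""

    def __init__(self) -> None:
        self.parents: Dict[str, List[str]] = {}
        self.arrival: Dict[str, int] = {}

    def add(self, commit: str, parents: List[str]) -> List[str]:
        """Record one commit from the stream and return the new display order."""
        if commit not in self.arrival:
            self.arrival[commit] = len(self.arrival)
            self.parents[commit] = list(parents)
        return self.order()

    def order(self) -> List[str]:
        # Count, for each known commit, how many known children point at it.
        pending = {c: 0 for c in self.parents}
        for ps in self.parents.values():
            for p in ps:
                if p in pending:
                    pending[p] += 1
        # Commits with no undisplayed children are ready; prefer arrival order.
        ready = [(self.arrival[c], c) for c, n in pending.items() if n == 0]
        heapq.heapify(ready)
        out: List[str] = []
        while ready:
            _, c = heapq.heappop(ready)
            out.append(c)
            for p in self.parents[c]:
                if p in pending:
                    pending[p] -= 1
                    if pending[p] == 0:
                        heapq.heappush(ready, (self.arrival[p], p))
        return out


if __name__ == "__main__":
    view = IncrementalView()
    # "a" arrives before its child "c", so the last step has to move it down.
    for commit, parents in [("d", ["b", "c"]), ("b", ["a"]), ("a", []), ("c", ["a"])]:
        print(view.add(commit, parents))
```

A pipe cannot take back the "a" it already printed, but a GUI list can simply
be redrawn with "a" moved below "c".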

Btw, from a git log viewer standpoint, the "merge history simplification" 
is exactly the same problem as the "--topo-order" flag: you could 
use the (incremental and very verbose)

	git log --full-history --parents

output as the base-line, and then you could do the commit simplification 
of things interactively.
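
As a sketch of the consuming side (Python, with made-up short commit ids and
a hypothetical helper name parse_commits): with --parents, each "commit"
header line of the default git log output carries the commit id followed by
its parent ids, so a viewer can pick those up one line at a time and feed
them to whatever simplification it does on its own side.

```python
from typing import Iterable, Iterator, List, Tuple


def parse_commits(lines: Iterable[str]) -> Iterator[Tuple[str, List[str]]]:
    """Yield (commit, parents) for each 'commit ...' header line as it arrives."""
    for line in lines:
        # Message bodies are indented, so they never start with "commit ".
        if line.startswith("commit "):
            fields = line.split()
            yield fields[1], fields[2:]


# Hypothetical sample in the shape of `git log --parents` output; real ids
# are full-length hashes, these short ones are placeholders.
SAMPLE = """\
commit d3 b2 c2
Merge: b2 c2
Author: A U Thor <author@example.com>
Date:   Sun Jul 27 22:00:00 2008 -0700

    Merge branch 'topic'

commit c2 a1
Author: A U Thor <author@example.com>
Date:   Sun Jul 27 21:00:00 2008 -0700

    commit on topic

commit a1
Author: A U Thor <author@example.com>
Date:   Sun Jul 27 20:00:00 2008 -0700

    initial
"""

if __name__ == "__main__":
    for commit, parents in parse_commits(SAMPLE.splitlines()):
        print(commit, parents)
```

In real use the lines would come from the stdout of a running "git log"
process rather than a string, so each record is available as soon as git
prints it.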

But "git log" itself cannot do it by default, since that would mean that 
git log itself would have to wait for the whole history to be generated. 

That's because output to a pipe is fundamentally linear (ie it cannot 
"re-write" the things it has already shown as it finds a simplification: 
there is no incremental way to rewrite things "after the fact").

			Linus