Re: [PATCH] fmt-merge-msg: avoid open "-|" list form for Perl 5.6

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




On Fri, 24 Feb 2006, Johannes Schindelin wrote:
> 
> Sorry, but no. Really no. Pipes have several advantages over temporary 
> files:
> 
> - The second program can already work on the data before the first 
>   finishes.

This really is a _huge_ issue in general, although probably not a very 
big one in this case.

This is what I talked about when I said "streaming" data. Look at the 
difference between

	git whatchanged -s drivers/usb

and

	git log drivers/usb

in the kernel repo. They give almost the same output, but...

Notice how one starts _immediately_, while the other starts after a few 
seconds (or, if you have a slow machine, and an unpacked archive, after 
tens of seconds or longer).

And the reason is that "git log" uses "git-rev-list" with a path limiter, 
and currently that ends up having to walk basically the whole history in 
order to generate a minimal graph.

In contrast, "git-whatchanged" uses "git-diff-tree" to limit the output, 
and git-diff-tree doesn't care about "minimal graph" or crud like that: it 
just cares about discarding any local commits that aren't interesting. It 
doesn't need to worry about updating parent chains etc, so it can do it 
all incrementally - and can thus start output as soon as it gets anything 
at all.

Now, maybe you think that "a few seconds" isn't a big deal. Sure, it's 
actually fast as hell, considering what it is doing, and anybody should be 
really really impressed that we can do that at all.

But (a) it _is_ a huge deal. Responsiveness is really important. And 
worse: (b) it scales badly with repository size. Creating the whole 
data-set before starting to output it really doesn't scale.

Now, I have ways to make "git-rev-list" better. It doesn't really need to 
walk the _whole_ history for its path limiting before it can start 
outputting stuff: it really _could_ do things more incrementally. However, 
it's a real bitch sometimes to work with incremental data when you don't 
know everything, so it gets a lot more complicated. 

So my point isn't that "git log drivers/usb" will get less and less 
responsive over time. I can fix that - eventually. My point is that in 
order to make it more responsive, I need to make it less synchronous. More 
"streaming". 

And that is where a pipe is so much better than a file. It's very 
fundamentally a streaming interface.

However, I suspect some of these issues are non-issues for the perl 
programs that work with a few entries at a time.

		Linus
-
: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]