Re: [RFC \ WISH] Add -o option to git-rev-list

"Marco Costalba" <mcostalba@xxxxxxxxx> · Mon, 11 Dec 2006 08:17:15 +0100

On 12/11/06, Linus Torvalds <torvalds@xxxxxxxx> wrote:

> How about you just compare something simpler:
>
>         git-rev-list | cat > /dev/null
>
> vs
>
>         git-rev-list > tmpfile ; cat tmpfile > /dev/null
>
> and see which one works better.

These are typical values (warm cache):

$ time git rev-list --header --boundary --parents --topo-order HEAD > /dev/null
3.04user 0.05system 0:03.09elapsed 100%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+10141minor)pagefaults 0swaps

$ time git rev-list --header --boundary --parents --topo-order HEAD | cat > /dev/null
3.67user 0.36system 0:04.29elapsed 93%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+18033minor)pagefaults 0swaps

$ time git rev-list --header --boundary --parents --topo-order HEAD > /tmp/tmp.txt; cat /tmp/tmp.txt > /dev/null
3.44user 0.28system 0:03.74elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+18033minor)pagefaults 0swaps

For some reason CPU usage *never* goes above 93% with the pipe (while it
easily does in the other two cases), and I repeated the test at least 10
times in a row. Is this perhaps a sign of blocking somewhere? Or of the
receiver's read being called too frequently (see below)?

> This is your problem:
>
> >               guiUpdateTimer.start(100, true);
>
> rather than just blindly starting a timer, you should ask it to wait until
> more data is available.
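
For reference, a minimal sketch of what "wait until more data is
available" could look like outside Qt, assuming a POSIX system where
select() works on pipe descriptors. The function name drain_when_ready
and its parameters are made up for illustration; this is not qgit code.

```python
import os
import select

def drain_when_ready(fd, chunk_size=65536, timeout=1.0):
    """Read fd until EOF, sleeping in select() until data is available."""
    total = 0
    while True:
        # Block until the descriptor is readable or the timeout expires.
        readable, _, _ = select.select([fd], [], [], timeout)
        if not readable:
            continue  # nothing yet; a GUI would process other events here
        data = os.read(fd, chunk_size)
        if not data:  # an empty read means the writer closed its end
            return total
        total += len(data)

if __name__ == "__main__":
    r, w = os.pipe()
    os.write(w, b"x" * 1000)
    os.close(w)  # closing the write end lets the reader see EOF
    print(drain_when_ready(r))
    os.close(r)
```

The reader wakes up only when the kernel reports data, so there is no
fixed 100ms delay between a chunk arriving and it being consumed.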

OK. I just don't understand why, after waiting 100ms, I get only 60KB
and go through the loop just once, instead of reading much more than
60KB over many cycles, then waiting another 100ms and finding another
big amount of data (about 1MB) ready to be read, as in the file case.

Perhaps the pipe buffers are small and block the writer when full. In
the file case, when I come back to the loop after 100ms, I find _a lot_
of data to read and I stay in the loop for many more than one cycle
before waiting another 100ms.

On the other hand, reading at intervals of, say, less than 10ms is exactly
what QProcess (socket based) was doing, and I ended up with my read
function being called at a furious pace, slowing everything down.
Experimenting with QProcess I found that, for performance reasons,
it's better to read big chunks a few times than small chunks many
times.
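
To illustrate how the number of read calls scales with chunk size, here
is a small sketch. It uses an in-memory stream as a stand-in for the
rev-list output, so it shows call counts only, not real pipe timing; the
function name count_reads is made up for the example.

```python
import io

def count_reads(stream, chunk_size):
    """Drain stream; return (bytes_read, number_of_non_empty_reads)."""
    total = calls = 0
    while True:
        data = stream.read(chunk_size)
        if not data:  # EOF
            return total, calls
        calls += 1
        total += len(data)

payload = b"x" * (1024 * 1024)  # 1 MiB stand-in for the real output
for size in (4096, 65536):
    print(size, count_reads(io.BytesIO(payload), size))
```

Each read carries a fixed overhead (a system call plus, in the GUI case,
a trip through the event loop), so fewer and larger reads mean less
total overhead for the same amount of data.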

 Marco

P.S: 30MB in 64KB chunks is 468 reads; over 3.67s that's one call every
7.8ms. If the pipe notifies the receiver that data is ready after each 4KB
(old kernels), then we would have 7,500 calls. That looks impossible to
handle in 3.67s: it would mean, on average, one call every 0.49ms.
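
The arithmetic above, spelled out as a short sketch. It assumes 1MB is
1000KB, which is the convention that gives the 468 and 7,500 figures;
the function name read_stats is only for illustration.

```python
def read_stats(total_kb, chunk_kb, elapsed_s):
    """Return (number_of_reads, average_interval_ms) for a given chunk size."""
    calls = total_kb / chunk_kb
    interval_ms = elapsed_s / calls * 1000
    return calls, interval_ms

TOTAL_KB = 30_000  # about 30MB of rev-list output
ELAPSED_S = 3.67   # user time measured in the pipe case

for chunk_kb in (64, 4):
    calls, interval_ms = read_stats(TOTAL_KB, chunk_kb, ELAPSED_S)
    print(f"{chunk_kb}KB chunks: {calls:.0f} reads, one every {interval_ms:.2f}ms")
```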

So IMHO bigger buffers for each read call could be the way to get
speed and the temporary file does exactly that.