Re: [PATCH] Teach "git add" and friends to be paranoid

Nicolas Pitre <nico@xxxxxxxxxxx> · Mon, 22 Feb 2010 10:40:59 -0500 (EST)

On Sun, 21 Feb 2010, Junio C Hamano wrote:

> Dmitry Potapov <dpotapov@xxxxxxxxx> writes:
> 
> > If you look at speed-up numbers, you can think that the numbers are
> > unstable, but in fact, the best time in 5 runs does not differ more
> > than 0.01s between those trials. But because difference for >=128Kb
> > is 0.05s or less, the accuracy of the above numbers is less than 25%.
> 
> Then wouldn't it make the following statement...
> 
> > But overall the outcome is clear -- read() is always a winner.
> 
> "... a winner, below 128kB; above that the difference is within noise and
> measurement error"?

read() is not always a winner.  A read() call always leaves the data 
duplicated in memory: once in the page cache and once in the caller's 
buffer.  Especially with large files, it is more efficient for the 
system as a whole to mmap() a 50 MB file than to allocate an extra 
50 MB of anonymous memory that cannot be paged out (except to the swap 
file, which would be yet another duplication of the data).  With 
mmap(), when there is memory pressure, the read-only mapped pages are 
simply dropped with no extra IO.

So when read() is not _significantly_ faster than mmap(), it should 
not be used.
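
To make the difference concrete, here is a minimal sketch of the two ways 
to get at a file's contents.  This is not git's actual code: the names 
sum_file_read() and sum_file_mmap() are made up for illustration, and 
summing the bytes is only a stand-in for real work such as hashing.  
Error handling is deliberately simplistic.

```c
#include <fcntl.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Stand-in for real work such as hashing: sum every byte of the buffer. */
static uint64_t sum_bytes(const unsigned char *buf, size_t len)
{
	uint64_t sum = 0;
	for (size_t i = 0; i < len; i++)
		sum += buf[i];
	return sum;
}

/*
 * read() variant (hypothetical helper): the contents are copied into
 * anonymous heap memory, so they exist twice (page cache + buffer)
 * while we work on them.  Returns 0 on success, -1 on error.
 */
static int sum_file_read(const char *path, uint64_t *out)
{
	struct stat st;
	unsigned char *buf;
	size_t len, done = 0;
	int fd = open(path, O_RDONLY);

	if (fd < 0)
		return -1;
	if (fstat(fd, &st) < 0) {
		close(fd);
		return -1;
	}
	len = (size_t)st.st_size;
	buf = malloc(len ? len : 1);	/* avoid malloc(0) for empty files */
	if (!buf) {
		close(fd);
		return -1;
	}
	while (done < len) {
		ssize_t n = read(fd, buf + done, len - done);
		if (n <= 0) {	/* error or short file; real code would retry on EINTR */
			free(buf);
			close(fd);
			return -1;
		}
		done += (size_t)n;
	}
	*out = sum_bytes(buf, len);
	free(buf);
	close(fd);
	return 0;
}

/*
 * mmap() variant (hypothetical helper): the page cache pages are mapped
 * read-only, so no second copy is allocated and the kernel may drop them
 * under memory pressure.  Returns 0 on success, -1 on error.
 */
static int sum_file_mmap(const char *path, uint64_t *out)
{
	struct stat st;
	void *map;
	size_t len;
	int fd = open(path, O_RDONLY);

	if (fd < 0)
		return -1;
	if (fstat(fd, &st) < 0) {
		close(fd);
		return -1;
	}
	len = (size_t)st.st_size;
	if (len == 0) {		/* mmap() rejects zero-length mappings */
		close(fd);
		*out = 0;
		return 0;
	}
	map = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, 0);
	close(fd);		/* the mapping stays valid after close() */
	if (map == MAP_FAILED)
		return -1;
	*out = sum_bytes(map, len);
	munmap(map, len);
	return 0;
}
```

Both functions compute the same result.  The difference is that with the 
read() variant a 50 MB file costs 50 MB of heap on top of whatever the 
page cache already holds, whereas with the mmap() variant the kernel can 
drop those clean pages and fault them back in from the file if needed.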

Nicolas