Re: cloning the kernel - why long time in "Resolving 313037 deltas"

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Theodore Tso <tytso@xxxxxxx> wrote:
> On Mon, Dec 18, 2006 at 07:13:40PM -0500, Nicolas Pitre wrote:
> > Maybe.  However the mmap() may occur on section of the pack file which 
> > has just been written to in order to write even more, always to the same 
> > file.  On Linux this is fast because the mmap'd data is likely to still 
> > be in the cache.
> > 
> > I guess this could be turned into a malloc()/read()/free() with no 
> > trouble.
> 
> Actually, depending on the size of the chunk, even on Linux
> malloc/read/free can be faster than the mmap/munmap, because
> mmap/munmap calls involve page table manipulations, and even on Linux
> that is often slower or dead even with the memory copy involved with
> using malloc/read.  Even when reading huge chunks of Canon Raw File
> data at a time, I found (experimentally) that it was no faster to use
> mmap() compared to read().  And for small chunks of data, malloc/read
> will definitely win out over mmap(), since the page table operations
> and resulting page faults completely trump the cost of copying the
> bytes from the page cache to the read() buffer.

This is why git-fast-import mmaps 128 MiB blocks from the file at
a time.  The mmap region is usually much larger than the file itself;
the application appends to the file via write() then goes back
and rereads data when necessary via the already established mmap.
Its rare for the application to need to unmap/remap a different block
so there really isn't very much page table manipulation overhead.

Why isn't git-index-pack doing the same?  Is there some hidden glitch
in some OS somewhere that has a problem with overmapping a file and
appending into it via write()?  I've done that on Mac OS X, Linux,
BSDi, Solaris...  never had a problem.

-- 
Shawn.
-
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]