Re: [PATCH][RFC] Add git-archive-tree

Rene Scharfe <rene.scharfe@xxxxxxxxxxxxxx> · Sun, 17 Sep 2006 13:54:13 +0200

Junio C Hamano wrote:
> Rene Scharfe <rene.scharfe@xxxxxxxxxxxxxx> writes:
> 
>> I then let the two chew away on the kernel repository.  And as 
>> kcachegrind impressively shows, all we do with our trees and 
>> objects is dwarfed by inflate().
> 
> The diff output codepath has a logic that says "if the blob we are
> dealing with has the same object name as the corresponding blob in
> the index, and if the index entry is clean (i.e. it is known that
> the file sitting in the working tree matches the blob), then do not
> inflate() but use data from that file instead".

Nice idea.  The tree traverser would need to provide the filenames
relative to the current working directory in addition to the
filenames as they are written to the archive.  I guess your para-walk
tree walker could be useful here.  Sadly, I haven't found the time to
look at it yet, and now it has even vanished from the pu branch.
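
To make the idea concrete, here is a rough sketch in Python.  It is
not git's code: IndexEntry, is_clean(), read_blob() and the
inflate_object callback are hypothetical names invented for
illustration, and the cleanliness test compares only size and mtime,
which is an assumption that simplifies whatever the real index check
does.  Note that the function takes both the path as recorded in the
index and the path relative to the current working directory, which
is the extra information the tree traverser would have to supply.

```python
import hashlib
import os
from dataclasses import dataclass


@dataclass
class IndexEntry:
    """Hypothetical, stripped-down stand-in for an index entry."""
    oid: str       # blob object name recorded in the index
    size: int      # file size when the entry was last refreshed
    mtime_ns: int  # modification time at that same moment


def blob_oid(data: bytes) -> str:
    """Blob object name: SHA-1 over 'blob <size>\\0' plus the content."""
    return hashlib.sha1(b"blob %d\0" % len(data) + data).hexdigest()


def is_clean(entry: IndexEntry, fs_path: str) -> bool:
    """Assumed cleanliness test: the stat data still matches the entry."""
    try:
        st = os.stat(fs_path)
    except OSError:
        return False
    return st.st_size == entry.size and st.st_mtime_ns == entry.mtime_ns


def read_blob(oid, repo_path, fs_path, index, inflate_object):
    """Return (data, source) for one blob of the tree being archived.

    repo_path is the name used as the index key; fs_path is the same
    file relative to the current working directory.
    """
    entry = index.get(repo_path)
    if entry is not None and entry.oid == oid and is_clean(entry, fs_path):
        # Same object name and a clean entry: skip inflate() entirely.
        with open(fs_path, "rb") as fh:
            return fh.read(), "worktree"
    # Otherwise fall back to the object store.
    return inflate_object(oid), "object store"
```

The fallback matters: whenever the entry is missing, names a
different blob, or looks modified, the sketch reads from the object
store, so the shortcut can only ever save work, never change the
archive contents (under the assumption that the cleanliness test is
trustworthy).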

A read is an order of magnitude faster than an inflate of the same
data; at least that's my guess from comparing the runtimes of
git-tar-tree and tar.  _However_, this doesn't account for I/O costs
(in my tests the repo and all checked-out files were cache hot) or for
the compression that would almost certainly be applied to the
resulting archive.  So the full runtime of archive creation wouldn't
be that much shorter.
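
For anyone who wants to get a feel for the proportions, a small
Python sketch along these lines times a plain read, an inflate and a
deflate of the same data.  It is only an illustration: time_call()
and compare() are names made up here, it uses Python's zlib module
rather than git-tar-tree or tar, and the file is cache hot just like
in my tests, so it says nothing about real I/O costs.

```python
import os
import tempfile
import time
import zlib


def time_call(fn, repeat=5):
    """Best wall-clock time in seconds over a few runs of fn()."""
    best = float("inf")
    for _ in range(repeat):
        start = time.perf_counter()
        fn()
        best = min(best, time.perf_counter() - start)
    return best


def compare(data: bytes) -> dict:
    """Time reading data from a file, inflating it and deflating it."""
    packed = zlib.compress(data)
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(data)
        path = tmp.name

    def read_file():
        with open(path, "rb") as fh:
            return fh.read()

    try:
        return {
            "read": time_call(read_file),
            "inflate": time_call(lambda: zlib.decompress(packed)),
            "deflate": time_call(lambda: zlib.compress(data)),
        }
    finally:
        os.unlink(path)


if __name__ == "__main__":
    # Repetitive sample text; real source files compress differently.
    sample = b"static int foo(struct tree *t) { return 0; }\n" * 50000
    for name, seconds in compare(sample).items():
        print(f"{name:8s} {seconds * 1000:.3f} ms")
```

The "deflate" figure stands in for the compression applied to the
finished archive, which is the cost that remains no matter where the
blob data comes from.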

René