Re: [PATCH 1/3] Lazily open pack index files on demand

Nicolas Pitre <nico@xxxxxxx> · Sun, 27 May 2007 11:26:06 -0400 (EDT)

On Sat, 26 May 2007, Shawn O. Pearce wrote:

> Dana How <danahow@xxxxxxxxx> wrote:
> > Shawn:  When I first saw the index-loading code,  my first
> > thought was that all the index tables should be
> > merged (easy since sorted) so callers only need to do one search.
> 
> Yes; in fact this has been raised on the list before.  The general
> idea was to create some sort of "super index" that had a list of
> all objects and which packfile they could be found in.  This way the
> running process doesn't have to search multiple indexes, and the
> process doesn't have to be responsible for the merging itself.
> 
> See the thing is, if you read all of every .idx file on a simple
> `git-log` operation you've already lost.  The number of trees and
> blobs tends to far outweigh the number of commits and they really
> outweigh the number of commits the average user looks at in a
> `git-log` session before they abort their pager.  So sorting all
> of the available .idx files before we produce even the first commit
> is a horrible thing to do.

There is also the question of memory footprint.  If you have a global
index, then for each object you need to have a tuple containing SHA1 +
pack offset + reference to corresponding pack.  Right now each per-pack
index entry only needs SHA1 + pack offset.
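
To put rough numbers on that, here is a minimal sketch comparing the two
entry layouts.  The per-pack entry follows the version 1 .idx layout
(4-byte offset followed by the 20-byte SHA1).  The global entry is purely
hypothetical: the 4-byte pack reference and the name index_bytes() are
assumptions made for illustration, not anything that exists in git.

```python
import struct

# Per-pack .idx entry, version 1 layout: 4-byte big-endian pack offset
# followed by the 20-byte SHA1.  ">" disables alignment padding.
PER_PACK_ENTRY = struct.Struct(">I20s")

# Hypothetical global-index entry: SHA1 + pack offset + a reference to the
# pack.  The 4-byte width of the pack reference is an assumption.
GLOBAL_ENTRY = struct.Struct(">20sII")


def index_bytes(entry, n_objects):
    """Bytes needed for the entry table alone (no header or fanout table)."""
    return entry.size * n_objects


if __name__ == "__main__":
    n = 1_000_000  # illustrative object count, not a measurement
    print("per-pack entries:", index_bytes(PER_PACK_ENTRY, n), "bytes")
    print("global entries:  ", index_bytes(GLOBAL_ENTRY, n), "bytes")
```

Under those assumptions every object costs 4 extra bytes in the table,
and that is before counting the cost of building the merged table.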

BTW I think the Newton-Raphson based index lookup approach should be 
revived at some point.
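
For readers who missed that earlier discussion, the sketch below shows
one way to read the idea: since SHA1s are close to uniformly distributed,
the position of a name in a sorted table can be estimated by linear
interpolation between the current bounds instead of always probing the
midpoint.  This is plain interpolation search, offered as an assumption
about the general approach and not as the code that was posted to the
list; the function name interpolation_lookup() is made up for
illustration.

```python
import hashlib


def interpolation_lookup(sorted_names, target):
    """Return the position of target in sorted_names, or -1 if absent.

    sorted_names is a list of unique 20-byte SHA1 digests sorted in
    ascending byte order, as in a pack index.
    """
    lo, hi = 0, len(sorted_names) - 1
    key = int.from_bytes(target, "big")
    while lo <= hi:
        lo_key = int.from_bytes(sorted_names[lo], "big")
        hi_key = int.from_bytes(sorted_names[hi], "big")
        if key < lo_key or key > hi_key:
            return -1  # target lies outside the remaining range
        if hi_key == lo_key:
            mid = lo  # single candidate left; avoids dividing by zero
        else:
            # Guess the slot by assuming names are evenly spread.
            mid = lo + (key - lo_key) * (hi - lo) // (hi_key - lo_key)
        name = sorted_names[mid]
        if name == target:
            return mid
        if name < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1


if __name__ == "__main__":
    names = sorted(hashlib.sha1(str(i).encode()).digest() for i in range(1000))
    wanted = hashlib.sha1(b"42").digest()
    print(interpolation_lookup(names, wanted) == names.index(wanted))
```

On uniformly distributed keys this kind of search is expected to need
fewer probes than bisection, which is the attraction for large indexes.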

Nicolas