Re: [RFC] super indexes to span multiple packfiles

Nicolas Pitre <nico@xxxxxxx> · Tue, 29 May 2007 12:19:13 -0400 (EDT)

On Tue, 29 May 2007, Jon Smirl wrote:

> Objects are not accessed in random order with git. Once an object
> reference hits a pack file it is very likely that following references
> will hit the same pack file. That's because you always find object
> SHAs by following the chains.
> 
> So the first place to look for an object is the same place the previous
> object was found. If it isn't there, order the search of the pack files
> by creation date (just a heuristic). Make this list a circle and start
> the search in the pack where the previous object was found. This can
> all be done with the existing indexes.
> 
> I haven't been reading all of the messages on this subject, but is
> this strategy enough to eliminate the need for a super index?

I think it could.

Personally I'm not a big fan of the super index notion.  It needs extra 
maintenance to keep in sync, and when it is out of sync it requires 
extra work at run time to fall back to traditional lookup.  And Shawn's 
testing didn't show significant performance gains either.

But a simple heuristic like the presumption that the next object is 
likely to be in the same pack as the previous one is the kind of thing 
that could provide significant improvements with very little effort.
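
To make the idea concrete, here is a rough Python sketch of the
"look in the last pack first" heuristic.  It is only an illustration,
not git's code: the PackSet class, its fields, and the modelling of a
pack index as a plain set of object IDs are all assumptions made up for
this example.  The pack list is assumed to be already ordered by the
fallback heuristic (e.g. creation date) and is walked as a circle
starting at the pack that produced the previous hit.

```python
class PackSet:
    """Toy model: each pack index is just a set of object IDs (hypothetical)."""

    def __init__(self, packs):
        # packs: list of sets, assumed pre-ordered by the fallback
        # heuristic (e.g. creation date).
        self.packs = packs
        self.last_hit = 0  # index of the pack where the previous object was found
        self.probes = 0    # number of per-pack index lookups performed so far

    def find(self, oid):
        """Return the index of the pack containing oid, or None if absent."""
        n = len(self.packs)
        # Treat the pack list as a circle and start at the last hit.
        for step in range(n):
            i = (self.last_hit + step) % n
            self.probes += 1
            if oid in self.packs[i]:
                self.last_hit = i  # remember where to start next time
                return i
        return None  # not in any pack; last_hit is left unchanged


packs = [{"a1", "a2"}, {"b1", "b2", "b3"}, {"c1"}]
ps = PackSet(packs)
for oid in ("b1", "b2", "b3"):
    ps.find(oid)
```

In this toy run the first lookup costs two probes (pack 0 misses,
pack 1 hits), and each following lookup in the same pack costs exactly
one, which is the whole point: no extra index has to be built or kept
up to date, only one remembered position.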

Nicolas