Re: inotify daemon speedup for git [POC/HACK]

Avery Pennarun <apenwarr@xxxxxxxxx> wrote:
> 
> While we're here, it's probably worth mentioning that git's index file
> format (which stores a sequential list of full paths in alphabetical
> order, instead of an actual hierarchy) does become a bottleneck when
> you actually have a huge number of files in your repo (like literally
> a million).  You can't actually binary search through the index!  The
> current implementation of submodules allows you to dodge that
> scalability problem since you end up with multiple smaller index
> files.  Anyway, that's fixable too.

Yes.

More than once I've been tempted to rewrite the on-disk (and I guess
in-memory) format of the index.  And then I remember how painful that
stuff is in either C git.git or JGit, and I back away slowly.  :-)

Ideally the index would be organized the same way the trees are, but
you still couldn't do a really good binary search because of the
ass-backwards name sorting rule for trees (a directory sorts as if
its name ended in "/").  And for performance reasons you still want
to keep the entire index in a single file; an index per directory
(aka SVN/CVS) is too slow for the common case of <30k files.
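
To illustrate the sorting rule, here is a rough Python sketch (not code
from git.git or JGit).  It assumes only the documented tree ordering:
entries compare bytewise, and a directory compares as if its name ended
in "/".  The names tree_sort_key and lookup are made up for this example.

```python
import bisect

def tree_sort_key(name: str, is_dir: bool) -> bytes:
    # Assumption: a directory entry compares as if its name ended in "/".
    return name.encode() + (b"/" if is_dir else b"")

def lookup(sorted_entries, name):
    # Hypothetical helper: binary search a tree-ordered list by name alone.
    # The sort key depends on the entry type, so we have to probe twice,
    # once as a file and once as a directory.
    keys = [tree_sort_key(n, d) for n, d in sorted_entries]
    for is_dir in (False, True):
        key = tree_sort_key(name, is_dir)
        i = bisect.bisect_left(keys, key)
        if i < len(keys) and keys[i] == key:
            return sorted_entries[i]
    return None

# (name, is_dir) pairs: one directory "foo" and two files next to it.
entries = [("foo", True), ("foo.c", False), ("foo-bar", False)]

tree_order = sorted(entries, key=lambda e: tree_sort_key(*e))
plain_order = sorted(entries, key=lambda e: e[0].encode())

print(tree_order)   # directory "foo" lands after "foo-bar" and "foo.c"
print(plain_order)  # a plain name sort puts "foo" first
print(lookup(tree_order, "foo"))
```

The point is just that the position of "foo" depends on whether it is a
file or a directory, so a search by name alone needs the extra probe.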

-- 
Shawn.

