Re: inotify daemon speedup for git [POC/HACK]

On Tue, Jul 27, 2010 at 7:39 PM, Joshua Juran <jjuran@xxxxxxxxx> wrote:
> On Jul 27, 2010, at 4:29 PM, Avery Pennarun wrote:
>
>> An inotify daemon could easily keep track of which files have been
>> added that aren't in the index... but where would it put the list of
>> files git doesn't know about?  Do they go in the index with a special
>> NOT_REALLY_INDEXED flag?
>
> One option is not to write it to disk at all.  The client could consult the
> daemon directly.

True.  What would the client-server protocol look like, though?  "Give
me the list of unknown files?"  Does the daemon need to understand
.gitignore or will it send back a list of all my million *.o files
every time?  etc.
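
As a rough sketch of the "list of unknown files" query, assuming the
daemon already holds the set of paths it has seen on disk and the set
of paths in the index: the function name unknown_files and the
basename-only glob matching are hypothetical simplifications, and real
.gitignore rules (negation, directory anchoring, per-directory files)
are considerably richer than this.

```python
from fnmatch import fnmatch


def unknown_files(on_disk, indexed, ignore_patterns):
    """Return paths seen on disk that are neither indexed nor ignored.

    Hypothetical helper: patterns are matched against the basename only,
    which is a simplification of real .gitignore semantics.
    """
    def ignored(path):
        basename = path.rsplit("/", 1)[-1]
        return any(fnmatch(basename, pat) for pat in ignore_patterns)

    candidates = set(on_disk) - set(indexed)
    return sorted(p for p in candidates if not ignored(p))
```

If the daemon applies the ignore patterns itself, as here, the reply
stays small even when the tree is full of *.o files; the cost is that
the daemon has to track changes to the ignore rules as well.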

Offhandedly, I think it would be nice to have an inotify daemon just
maintain (something like) the git index file where it just has a list
of *all* the files in a form that's a) random access, not just
sequential, and b) really fast when accessed sequentially.
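
One way to get both properties, sketched under assumptions: a sorted
list of NUL-terminated names preceded by a table of fixed-width
offsets.  The layout and the names build_index, lookup and
iter_entries are made up for illustration; this is neither git's nor
bup's actual format.

```python
import struct


def build_index(paths):
    """Pack paths as: entry count, offset table, NUL-terminated names."""
    names = sorted(p.encode("utf-8") for p in paths)
    pos = 4 + 4 * len(names)          # names start after count + offsets
    offsets, blob = [], b""
    for name in names:
        offsets.append(pos)
        blob += name + b"\0"
        pos += len(name) + 1
    header = struct.pack(">I", len(names))
    table = struct.pack(">%dI" % len(names), *offsets)
    return header + table + blob


def entry_at(buf, i):
    """Random access: read entry i via the fixed-width offset table."""
    (off,) = struct.unpack_from(">I", buf, 4 + 4 * i)
    return buf[off:buf.index(b"\0", off)]


def lookup(buf, path):
    """Binary search for path; return its position or -1 if absent."""
    target = path.encode("utf-8")
    (count,) = struct.unpack_from(">I", buf, 0)
    lo, hi = 0, count
    while lo < hi:
        mid = (lo + hi) // 2
        name = entry_at(buf, mid)
        if name < target:
            lo = mid + 1
        elif name > target:
            hi = mid
        else:
            return mid
    return -1


def iter_entries(buf):
    """Sequential access: walk the name area front to back."""
    (count,) = struct.unpack_from(">I", buf, 0)
    pos = 4 + 4 * count
    for _ in range(count):
        end = buf.index(b"\0", pos)
        yield buf[pos:end].decode("utf-8")
        pos = end + 1
```

The offset table is what makes binary search possible over
variable-length names, while the name area can still be streamed in
order without consulting the table at all.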

Knowing that large numbers of files can cause slowness, I was planning
ahead for inotify when I designed bup's index file format, and it
meets the above criteria.  Unfortunately I screwed up other stuff
(adding new files is too slow) and it still needs to be rewritten
anyway.  Oh well.

While we're here, it's probably worth mentioning that git's index file
format (which stores a sequential list of full paths in alphabetical
order, instead of an actual hierarchy) does become a bottleneck when
you have a huge number of files in your repo (like literally a
million).  You can't binary search through the index!  The current
implementation of submodules lets you dodge that scalability problem,
since you end up with multiple smaller index files.  Anyway, that's
fixable too.
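
For contrast with a flat list of full paths, here is a minimal sketch
of what "an actual hierarchy" could mean: one small table per
directory, so a lookup only touches the directories along the path.
The nested-dict representation and the names build_tree and contains
are purely illustrative assumptions, not a proposed on-disk format.

```python
def build_tree(paths):
    """Turn flat 'a/b/c' paths into nested per-directory dicts.

    Directories map to dicts; files map to None.
    """
    root = {}
    for path in paths:
        node = root
        parts = path.split("/")
        for part in parts[:-1]:
            node = node.setdefault(part, {})
        node[parts[-1]] = None
    return root


def contains(tree, path):
    """Check for a file by walking one directory table per component."""
    node = tree
    parts = path.split("/")
    for part in parts[:-1]:
        node = node.get(part)
        if not isinstance(node, dict):
            return False
    return parts[-1] in node and node[parts[-1]] is None
```

Each directory table stays small no matter how large the repo gets,
which is roughly the effect submodules give you by accident.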

Have fun,

Avery