On Tue, Jul 27, 2010 at 7:39 PM, Joshua Juran <jjuran@xxxxxxxxx> wrote:
> On Jul 27, 2010, at 4:29 PM, Avery Pennarun wrote:
>
>> An inotify daemon could easily keep track of which files have been
>> added that aren't in the index... but where would it put the list of
>> files git doesn't know about?  Do they go in the index with a special
>> NOT_REALLY_INDEXED flag?
>
> One option is not to write it to disk at all.  The client could consult the
> daemon directly.

True.  What would the client-server protocol look like, though?  "Give
me the list of unknown files?"  Does the daemon need to understand
.gitignore or will it send back a list of all my million *.o files
every time?  etc.

Offhandedly, I think it would be nice to have an inotify daemon just
maintain (something like) the git index file where it just has a list
of *all* the files in a form that's a) random access, not just
sequential, and b) really fast when accessed sequentially.

Knowing that large numbers of files can cause slowness, I was planning
ahead for inotify when I designed bup's index file format, and it
meets the above criteria.  Unfortunately I screwed up other stuff
(adding new files is too slow) and it still needs to be rewritten
anyway.  Oh well.

While we're here, it's probably worth mentioning that git's index file
format (which stores a sequential list of full paths in alphabetical
order, instead of an actual hierarchy) does become a bottleneck when
you actually have a huge number of files in your repo (like literally
a million).  You can't actually binary search through the index!  The
current implementation of submodules allows you to dodge that
scalability problem since you end up with multiple smaller index
files.  Anyway, that's fixable too.

Have fun,

Avery
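
To make criteria (a) and (b) above concrete, here is a minimal sketch of
one way a sorted path list can be both binary-searchable and cheap to
scan in order.  It is not git's index format and not bup's; the layout
(a table of fixed-width offsets in front of a blob of concatenated
paths) and the names build_index, entry_at, lookup and scan are all
hypothetical, chosen only for illustration.

```python
import struct


def build_index(paths):
    """Return (table, blob) for a hypothetical sorted path index.

    blob  : all paths, UTF-8 encoded, concatenated in sorted byte order
    table : len(paths) + 1 little-endian uint32 offsets into blob
    """
    entries = sorted(p.encode("utf-8") for p in paths)
    offsets = [0]
    for e in entries:
        offsets.append(offsets[-1] + len(e))
    table = struct.pack("<%dI" % len(offsets), *offsets)
    return table, b"".join(entries)


def entry_at(table, blob, i):
    """Random access: fetch the i-th path without touching earlier ones."""
    start, end = struct.unpack_from("<II", table, 4 * i)
    return blob[start:end]


def lookup(table, blob, path):
    """Binary search for path; return its position, or -1 if absent."""
    target = path.encode("utf-8")
    count = len(table) // 4 - 1  # the table holds one extra end offset
    lo, hi = 0, count
    while lo < hi:
        mid = (lo + hi) // 2
        if entry_at(table, blob, mid) < target:
            lo = mid + 1
        else:
            hi = mid
    if lo < count and entry_at(table, blob, lo) == target:
        return lo
    return -1


def scan(table, blob):
    """Sequential access: yield every path in sorted order."""
    for i in range(len(table) // 4 - 1):
        yield entry_at(table, blob, i)
```

The fixed-width offset table is what makes the binary search possible:
a plain run of variable-length records has no way to jump to the
middle entry without reading everything before it.  A real format
would also need per-file metadata and a cheap way to add entries,
which this sketch ignores.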