On Tue, 2014-11-11 at 19:49 +0700, Duy Nguyen wrote:
> I've come to the last piece to speed up "git status", watchman
> support. And I realized it's not as good as I thought.
>
> Watchman could be used for two things: to avoid refreshing the index,
> and to avoid searching for ignored files. The first one can be done
> (with the patch below as demonstration). And it should keep refresh
> cost to near zero in the best case; the cost is proportional to the
> number of modified files.
>
> As for avoiding searching for ignored files, my intention was to
> build on top of untracked cache. If watchman can tell me what files
> are added or deleted since last observed time, then I can invalidate
> just directories that contain them, or even better, calculate ignore
> status for those files only.
>
> This is important because in reality compilers and editors tend to
> update files by creating a new version then renaming it, updating
> directory mtime and invalidating untracked cache as a consequence. As
> you edit more files (or your rebuild touches more dirs), untracked
> cache performance drops (until the next "git status"). The numbers I
> posted so far are the best case.
>
> The problem with watchman is it cannot tell me "new" files since the
> last observed time (let's say 'T'). If a file exists at 'T', gets
> deleted then recreated, then watchman tells me it's a new file. I want
> to separate those from ones that did not exist before 'T'.
>
> David's watchman approach does not have this problem because he keeps
> track of all entries under $GIT_WORK_TREE and knows which files are
> truly new. But I don't really want to keep the whole file list around,
> especially when watchman already manages the same list.
>
> So we got a few options:
>
> 1) Convince watchman devs to add something to make it work

Based on the thread on the watchman GitHub, it looks like this won't
happen.

> 2) Fork watchman
>
> 3) Make another daemon to keep file list around, or put it in a shared
> memory.
>
> 4) Move David's watchman series forward (and maybe make use of shared
> mem for fs_cache).
>
> 5) Go with something similar to the patch below and accept untracked
> cache performance degrades from time to time
>
> 6) ??
>
> I'm working on 1). 2) is just bad taste, listed for completeness
> only. If we go with 3) and watchman starts to support Windows (seems
> to be in their plan), we'll need to rework it somehow. And I really
> don't like 3).
>
> If 1-3 do not work out, we're left with 4) and 5). We could
> support both, but probably not worth the code complexity and should
> just go with one.
>
> And if we go with 4) we should probably think of dropping untracked
> cache if watchman will support Windows in the end. 4) also has another
> advantage over untracked cache, that it could speed up listing ignored
> files as well as untracked files.
>
> Comments?

I don't think it would be impossible to add Windows support to
watchman; the necessary functions exist, although I don't know how well
they work. My experience with watchman is that it is something of a
stress test of a filesystem's notification layer. It has exposed bugs
in inotify, and caused system instability on OS X.

My patches are not the world's most beautiful, but they do work. I
think some improvement might be possible by keeping info about tracked
files in the index, and only storing the tree of ignored and untracked
files separately. But I have not thought this through fully.

In any case, making use of shared memory for the fs_cache (as some of
your other patches do for the index) would definitely save time.
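
To make the "truly new" versus "deleted then recreated" distinction
concrete, here is a minimal sketch in Python. It is not git or watchman
code. The names classify_changes and dirs_to_invalidate are
hypothetical, and the (path, exists_now) event shape is an assumption
made for illustration, not watchman's actual output format. The point
is only that telling the two cases apart requires some record of which
paths existed at 'T'.

```python
import posixpath


def classify_changes(known_paths, changes):
    """Split reported changes into added, deleted and rewritten paths.

    known_paths: set of paths that existed at the last observed time T
                 (the list that someone has to keep around).
    changes:     iterable of (path, exists_now) pairs; this shape is an
                 assumption for the sketch, not a real watcher format.
    """
    added, deleted, rewritten = set(), set(), set()
    for path, exists_now in changes:
        if exists_now and path not in known_paths:
            added.add(path)        # did not exist at T: truly new
        elif not exists_now and path in known_paths:
            deleted.add(path)      # existed at T, gone now
        elif exists_now:
            rewritten.add(path)    # existed at T, modified or recreated
        # else: created and removed again after T, nothing to do
    return added, deleted, rewritten


def dirs_to_invalidate(added, deleted):
    """Only directories that gained or lost an entry need rescanning."""
    return {posixpath.dirname(p) for p in added | deleted}


known = {"Makefile", "src/a.c", "src/b.c"}
changes = [
    ("src/a.c", True),     # editor wrote a new version and renamed it
    ("src/b.c", False),    # removed
    ("src/new.o", True),   # build output that did not exist at T
    ("tmp/x", False),      # transient file, never seen at T
]
added, deleted, rewritten = classify_changes(known, changes)
stale_dirs = dirs_to_invalidate(added, deleted)
```

Without known_paths, "src/a.c" and "src/new.o" look the same to the
caller, which is the case where the untracked cache for "src" has to be
thrown away even though only a tracked file was rewritten.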