Re: [RFC] On watchman support

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, 2014-11-11 at 19:49 +0700, Duy Nguyen wrote:
> I've come to the last piece to speed up "git status", watchman
> support. And I realized it's not as good as I thought.
> 
> Watchman could be used for two things: to avoid refreshing the index,
> and to avoid searching for ignored files. The first one can be done
> (with the patch below as demonstration). And it should keep refresh
> cost to near zero in the best case, the cost is proportional to the
> number of modified files.
> 
> For avoiding searching for ignored files. My intention was to build on
> top of untracked cache. If watchman can tell me what files are added
> or deleted since last observed time, then I can invalidate just
> directories that contain them, or even better, calculate ignore status
> for those files only.
> 
> This is important because in reality compilers and editors tend to
> update files by creating a new version then rename them, updating
> directory mtime and invalidating untracked cache as a consequence. As
> you edit more files (or your rebuild touches more dirs), untracked
> cache performance drops (until the next "git status"). The numbers I
> posted so far are the best case.
> 
> The problem with watchman is it cannot tell me "new" files since the
> last observed time (let's say 'T'). If a file exists at 'T', gets
> deleted then recreated, then watchman tells me it's a new file. I want
> to separate those from ones that do not exist before 'T'.
> 
> David's watchman approach does not have this problem because he keeps
> track of all entries under $GIT_WORK_TREE and knows which files are
> truely new. But I don't really want to keep the whole file list around,
> especially when watchman already manages the same list.
> 
> So we got a few options:
> 
> 1) Convince watchman devs to add something to make it work

Based on the thread on the watchman github it looks like this won't
happen. 

> 2) Fork watchman
> 
> 3) Make another daemon to keep file list around, or put it in a shared
>    memory.
> 
> 4) Move David's watchman series forward (and maybe make use of shared
>    mem for fs_cache).
> 
> 5) Go with something similar to the patch below and accept untracked
>    cache performance degrades from time to time
> 
> 6) ??
> 
> I'm working on 1). 2) is just bad taste, listed for completeness
> only. If we go with 3) and watchman starts to support Windows (seems
> to be in their plan), we'll need to rework some how. And I really
> don't like 3)
> 
> If 1-3 does not work out, we're left without 4) and 5). We could
> support both, but proobably not worth the code complexity and should
> just go with one.
> 
> And if we go with 4) we should probably think of dropping untracked
> cache if watchman will support Windows in the end. 4) also has another
> advantage over untracked cache, that it could speed up listing ignored
> files as well as untracked files.
> 
> Comments?

I don't think it would be impossible to add Windows support to watchman;
the necessary functions exist, although I don't know how well they work.
My experience with watchman is that it is something of a stress test of
a filesystem's notification layer.  It has exposed bugs in inotify, and
caused system instability on OS X.

My patches are not the world's most beautiful, but they do work.  I
think some improvement might be possible by keeping info about tracked
files in the index, and only storing the tree of ignored and untracked
files separately.  But I have not thought this through fully.  In any
case, making use of shared memory for the fs_cache (as some of your
other patches do for the index) would definitely save time.


--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]