Re: Watchman support for git

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Sun, 2014-05-04 at 07:15 +0700, Duy Nguyen wrote:
> > I would like to merge the feature into master.  It works well for me,
> > and some of my colleagues who have tried it out.
> 
> Have you tried to turn watchman on by default, then run it with git
> test suite? That usually helps.

I have.  The tests work run fine under make, but prove sometimes freezes
due to an issue in libwatchman which I just fixed (and which I plan to
merge as soon as I can get a colleague to look the changes over).

> > I can split the vmac patch into two, but one of them will remain quite
> > large because it contains the code for VMAC and AES, which total a bit
> > over 100k.  Since the list will probably reject that, I'll post a link
> > to a repository containing the patches.
> 
> With the read-cache deamon, I think hashing cost is less of an issue,
> so new hashing algorithm becomes less important. If you store the file
> cache in the deamon's memory only, there's no need to hash anything.
> But I guess you already tried this.

I agree that with the daemon, the cost is less of an issue, but I am not
100% sure it is a non-issue; consecutive commands that need to
read/write the index can still be slowed down.

> > I'm not 100% sure how to split the watchman patch up.  I could add the
> > fs_cache code and then separately add the watchman code that populates
> > the cache.  Do you think there is a need to divide it up beyond this?
> 
> I'll need to have closer look at your patches to give any suggestions.

I have uploaded a new version (which is about 5-10% faster and which
corrects some minor changes) to https://github.com/dturner-tw/git.git on
the watchman branch.  

> Although if you don't mind waiting a bit, I can try to put my
> untracked cache patches in good shape (hopefully in 2 weeks), then you
> can mostly avoid touching dir.c and reuse my work.

If the untracked cache patches are going to make it into master, then I
would of course be willing to rewrite on top of them.  But I would also
like to have a sense of whether there is any interest in watchman
support (outside of Twitter).

For what it's worth, the numbers today for index version 4 are for my
superscience repo are:
~380 (no watchman), ~260 (untracked-cache), ~175 (watchman).

That's because untracked-cache still has to stat every directory.

> I backed away from watchman support because I was worried about its
> overhead (of watchman itself, and git/watchman IPC because it's not
> designed specifically for git), which led me to try optimizing git as
> much as possible without watchman first, then see how/if watchman can
> help on top of that. I still think it's a good approach (maybe because
> it started to make me doubt if watchman could pull a big performance
> win on top to justify the changes to support it)

I think on large repositories (especially deeply-nested ones), with the
common case of a small number of changes, watchman will end up being a
big win.  Java tends towards deep nesting
(src/main/java/com/twitter/common/...), which is probably why my test
repo had the largest speedup (>50%).  The IPC overhead might become bad
if there were a large number of changes, but so far this has not been an
issue for me in testing.  

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]