On Sun, 2014-05-04 at 07:15 +0700, Duy Nguyen wrote:
> > I would like to merge the feature into master.  It works well for me,
> > and some of my colleagues who have tried it out.
>
> Have you tried to turn watchman on by default, then run it with git
> test suite? That usually helps.

I have.  The tests run fine under make, but prove sometimes freezes due
to an issue in libwatchman which I just fixed (and which I plan to merge
as soon as I can get a colleague to look the changes over).

> > I can split the vmac patch into two, but one of them will remain quite
> > large because it contains the code for VMAC and AES, which total a bit
> > over 100k.  Since the list will probably reject that, I'll post a link
> > to a repository containing the patches.
>
> With the read-cache daemon, I think hashing cost is less of an issue,
> so new hashing algorithm becomes less important. If you store the file
> cache in the daemon's memory only, there's no need to hash anything.
> But I guess you already tried this.

I agree that with the daemon, the cost is less of an issue, but I am not
100% sure it is a non-issue; consecutive commands that need to
read/write the index can still be slowed down.

> > I'm not 100% sure how to split the watchman patch up.  I could add the
> > fs_cache code and then separately add the watchman code that populates
> > the cache.  Do you think there is a need to divide it up beyond this?
>
> I'll need to have closer look at your patches to give any suggestions.

I have uploaded a new version (which is about 5-10% faster and which
includes some minor corrections) to https://github.com/dturner-tw/git.git
on the watchman branch.

> Although if you don't mind waiting a bit, I can try to put my
> untracked cache patches in good shape (hopefully in 2 weeks), then you
> can mostly avoid touching dir.c and reuse my work.

If the untracked cache patches are going to make it into master, then I
would of course be willing to rewrite on top of them.
But I would also like to have a sense of whether there is any interest
in watchman support (outside of Twitter).

For what it's worth, the numbers today for index version 4 for my
superscience repo are: ~380 (no watchman), ~260 (untracked-cache),
~175 (watchman).  Untracked-cache does not get as low as watchman
because it still has to stat every directory.

> I backed away from watchman support because I was worried about its
> overhead (of watchman itself, and git/watchman IPC because it's not
> designed specifically for git), which led me to try optimizing git as
> much as possible without watchman first, then see how/if watchman can
> help on top of that. I still think it's a good approach (maybe because
> it started to make me doubt if watchman could pull a big performance
> win on top to justify the changes to support it)

I think on large repositories (especially deeply-nested ones), with the
common case of a small number of changes, watchman will end up being a
big win.  Java tends towards deep nesting
(src/main/java/com/twitter/common/...), which is probably why my test
repo had the largest speedup (>50%).

The IPC overhead might become bad if there were a large number of
changes, but so far this has not been an issue for me in testing.