Re: Watchman support for git

On Sat, 2014-05-03 at 07:52 +0700, Duy Nguyen wrote:
> On Sat, May 3, 2014 at 6:14 AM,  <dturner@xxxxxxxxxxxxxxxx> wrote:
> > The index format change might be less important with the split index;
> > I haven't investigated that since at the time I wrote these patches,
> > it didn't exist.
> 
> This is the worst case scenario of "git status" on webkit.git (182k
> files, path name 74 byte long on average), hot cache, no SSD
> 
>    366.379ms gitmodules_config:199 if (read_cache() < 0) die("index file
>      0.004ms cmd_status:1294 read_cache_preload(&s.pathspec);
>    488.433ms cmd_status:1295 refresh_index(&the_index, REFRESH_QUIE
>    456.495ms cmd_status:1299 update_index_if_able(&the_index, &inde
>     13.088ms wt_status_collect:616 wt_status_collect_changes_worktree(s)
>    706.926ms wt_status_collect:621 wt_status_collect_changes_index(s)
>    100.495ms lazy_init_name_hash:136 { int nr; if (istate->name_hash_initia
>    921.185ms wt_status_collect:622 wt_status_collect_untracked(s)
> 
> real    0m2.969s
> user    0m1.943s
> sys     0m1.021s

For me, those times are:
0m0.581s (no watchman, index v4)
0m0.465s (watchman, index v4)
0m0.445s (watchman, index v5)

That's not a huge win on its own, but (a) it's better than nothing and (b)
it lays the groundwork for other improvements.
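
To put the three timings above in relative terms, here is a small sketch
that works out the wall-clock reduction against the no-watchman run.  The
dictionary labels and the percent_saved() helper are names made up for this
illustration; only the three numbers come from the measurements above.

```python
# Wall-clock "git status" times reported above, in seconds.
timings = {
    "no watchman, index v4": 0.581,
    "watchman, index v4": 0.465,
    "watchman, index v5": 0.445,
}

baseline = timings["no watchman, index v4"]


def percent_saved(baseline_s, measured_s):
    """Return the wall-clock reduction relative to the baseline, in percent."""
    return (baseline_s - measured_s) / baseline_s * 100.0


for label, seconds in timings.items():
    saved = percent_saved(baseline, seconds)
    print(f"{label}: {seconds:.3f}s ({saved:.1f}% saved)")
```

That comes to roughly a 20% reduction with watchman on index v4 and roughly
23% with watchman on index v5.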

A fair amount (~12%) of the time seems to be spent in zlib; this varies
based on how the data is packed IIRC. 

> Index v4 and split index (and the following read-cache daemon,
> hopefully) 

Looking at some of the archives for read-cache daemon, it seems to be
somewhat similar to watchman, right?  But I only saw inotify code; what
about Mac OS?  Or am I misunderstanding what it is?

> should help reduce the numbers on the 1st and 4th lines, I
> expect to less than 50ms each. lazy_init_name_hash could be taken
> away with read-cache daemon also.
> 
> core.preloadindex can cut the total number of 2nd and 3rd lines by
> half. Watchman should help in these two lines, but it should do better
> than core.preloadindex.
> 
> wt_status_collect_changes_index() depends on how damaged cache-tree is
> (in this case, totally scrapped). watchman does not help this either.
> We need to try to "heal" cache-tree as much as possible to reduce the
> number.
> 
> The last line could be a competition between watchman and my coming
> "untracked cache" series. I expect to cut the number in that line at
> least in half without external dependency.

I hadn't seen the "untracked cache" work (I actually finished these
patches a month or so ago but have been waiting for some internal
reviews before sending them out).  Looks interesting.  It seems we use a
similar strategy for handling ignores.

> Patch 2/3 did not seem to make it to the list by the way.. 

Thanks for your comments.  I just tried again to send patch 2/3.  I do
actually see the CC of it in my @twitter.com mailbox, but I don't see it
in the archives on the web.  Do you know if there is a reason the
mailing list would reject it?  At any rate, the contents may be found at
https://github.com/dturner-tw/git/commit/cf587d54fc72d82a23267348afa2c4b60f14ce51.diff

> initial
> reaction is storing the list of all paths seems too much, but I'll
> need to play with it a bit to understand it.

I wonder if it would make sense to use the untracked cache as the
storage strategy, but use watchman as the update strategy.
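
One way to picture that split is the rough sketch below.  It is not git's
untracked cache and it does not use watchman's real interface: the
UntrackedCache class, the scan_dir callback and notify_changed() are all
hypothetical names, and it assumes the cache keeps one listing per directory
and that the watcher hands back a list of changed paths.

```python
import posixpath


class UntrackedCache:
    """Hypothetical per-directory cache of untracked entries.

    Storage: one cached listing per directory.
    Update: a file watcher reports changed paths, and only the directories
    containing those paths are invalidated and rescanned on next use.
    """

    def __init__(self, scan_dir):
        # scan_dir(directory) -> iterable of untracked names (caller-supplied).
        self._scan_dir = scan_dir
        self._entries = {}  # directory -> set of untracked names
        self.scans = 0      # count of real directory scans, for illustration

    def untracked(self, directory):
        """Return the untracked names in a directory, scanning only on a miss."""
        if directory not in self._entries:
            self._entries[directory] = set(self._scan_dir(directory))
            self.scans += 1
        return self._entries[directory]

    def notify_changed(self, paths):
        """Drop the cached listing for the parent directory of each path."""
        for path in paths:
            self._entries.pop(posixpath.dirname(path), None)


# A stand-in for the working tree: directory -> untracked names.
fake_tree = {"": {"notes.txt"}, "src": {"scratch.c"}}
cache = UntrackedCache(lambda d: fake_tree.get(d, set()))

cache.untracked("src")
cache.untracked("src")               # served from the cache, no rescan
fake_tree["src"].add("new.c")
cache.notify_changed(["src/new.c"])  # what a watcher would report
refreshed = cache.untracked("src")   # rescans only "src"
```

The point of the split is that an unchanged directory is never rescanned,
while the watcher, rather than a stat() of every directory, decides what
counts as changed.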
