Re: [PATCH v13 11/20] index-helper: use watchman to avoid refreshing index with lstat()

Duy Nguyen <pclouds <at> gmail.com> writes:

> 
> On Thu, Jun 30, 2016 at 7:55 PM, Ben Peart <peartben <at> gmail.com> wrote:
> > David Turner <novalis <at> novalis.org> writes:
> >
> >>
> >> Hiding watchman behind index-helper means you need both daemons. You
> >> can't run watchman alone. Not so good. But on the other hand, 'git'
> >> binary is not linked to watchman/json libraries, which is good for
> >> packaging. Core git package will run fine without watchman-related
> >> packages. If they need watchman, they can install git-index-helper and
> >> dependencies.
> >>
> >
> > Have you considered splitting index-helper and watchman apart?  Using
> > Watchman to not lstat unchanged entries is a huge perf win with very
> > large repos.
> 
> On large repos (i.e. lots of files/dirs on worktree), the cost of
> reading index will increase proportionally. Yes lstat costs, but I
> suspect index reading (integrity verification actually) may cost more,
> especially on platforms with cheap lstat like linux. On these repos
> you really want to enable all four: index-helper (with watchman),
> split-index (I still need to work out pruning on split-index) and
> untracked cache. There's still a lot more to make git run fast on
> large repos though.
> 

I've found (at least on Windows) that as the repo size gets larger, the
time to read the index becomes a much smaller percentage of the overall
time.  I just captured some perf traces of git status on a large repo we
have.  Of the total time, 92.5% was spent in git!read_directory and only
4.0% was spent in git!read_index.  Of that 4.0%, 2.6% was
git!blk_SHA1_Update.

Given there were no dirty files, Watchman would have made a huge
improvement in the overall time, but index-helper would have had
relatively little impact.  We've noticed this same pattern in all our
repos, which is what is driving our interest in the Watchman model and
why index-helper is of less interest to us.
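
To put rough numbers on that, here is a minimal sketch.  It assumes the
92.5% and 4.0% figures from the trace above, plus the optimistic
assumption that a phase could be eliminated entirely.  It is an
Amdahl's-law upper bound, not a measurement, and the names max_speedup
and print_bounds are made up for illustration.

```c
#include <stdio.h>

/*
 * Upper bound on the overall speedup if a phase that takes `fraction`
 * of the total run time (0.0 <= fraction < 1.0) cost nothing at all.
 * Hypothetical helper for illustration; not part of git.
 */
double max_speedup(double fraction)
{
	return 1.0 / (1.0 - fraction);
}

/* Print the bound for each of the two phases seen in the trace. */
void print_bounds(void)
{
	printf("read_directory eliminated: at most %.1fx\n",
	       max_speedup(0.925));
	printf("read_index eliminated:     at most %.2fx\n",
	       max_speedup(0.040));
}
```

Calling print_bounds() from a main() shows the gap: even a free
read_index cannot buy more than a few percent, while the ceiling for
read_directory is an order of magnitude higher.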

> > It would also be interesting to make the Watchman backend replaceable by
> > using an extensible API.  This has the benefit of not having to link the
> > 'git' binary to the watchman/json libraries.
> 
> 'git' binary is not linked to watchman libraries. git-index-helper is
> a separate binary, by design. In theory you can create a
> 'git-index-helper' replacement binary with something other than
> watchman. I think David documented the protocol well (it may change in
> the future though and we are not prepared for capability progression)
> 
> > Is there any pattern already in git for accomplishing this?

While the current design hides watchman behind index-helper, if we were
to change that model so that they were independent, we would be
interested in doing it in a way that provides some abstraction, so that
watchman could be replaced with another file-watching daemon.
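
As a strawman for what such an abstraction might look like, here is a
minimal sketch.  Every name in it (fs_watcher_backend, changed_since,
null_backend, find_backend) is hypothetical; none of this exists in git
or in Watchman, and it is not the protocol David documented.  It only
assumes that a backend can report the paths changed since some opaque
token, or else tell the caller to fall back to lstat().

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical backend interface; all names invented for illustration. */
struct fs_watcher_backend {
	const char *name;
	/* Start or connect to the daemon for `worktree`; 0 on success. */
	int (*connect)(const char *worktree);
	/*
	 * Call `cb` once per path changed since the opaque `token`.
	 * Returns 0 on success, or -1 if the caller must fall back to
	 * a full lstat() scan.
	 */
	int (*changed_since)(const char *token,
			     void (*cb)(const char *path, void *data),
			     void *data);
};

/* Trivial backend: connects fine, but always forces the fallback. */
static int null_connect(const char *worktree)
{
	(void)worktree;
	return 0;
}

static int null_changed_since(const char *token,
			      void (*cb)(const char *path, void *data),
			      void *data)
{
	(void)token;
	(void)cb;
	(void)data;
	return -1;
}

const struct fs_watcher_backend null_backend = {
	"null", null_connect, null_changed_since
};

/* A watchman backend (or any other daemon) would be listed here too. */
static const struct fs_watcher_backend *backends[] = { &null_backend };

/* Look up a backend by name; returns NULL if none is registered. */
const struct fs_watcher_backend *find_backend(const char *name)
{
	size_t i;

	for (i = 0; i < sizeof(backends) / sizeof(backends[0]); i++)
		if (!strcmp(backends[i]->name, name))
			return backends[i];
	return NULL;
}
```

With something along these lines, the caller would only ever see the
struct, and which daemon sits behind it could become a build-time or
configuration choice.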
