On 1/30/2018 6:16 PM, Ævar Arnfjörð Bjarmason wrote:
On Tue, Jan 30 2018, Ben Peart jotted:
While some of these issues have been discussed in other threads, I
thought I'd summarize my thoughts here.
Thanks for this & your other reply. I'm going to get to testing some of
Duy's patches soon, and if you have some other relevant WIP I'd be happy
to try them, but meanwhile replying to a few of these:
On 1/26/2018 7:28 PM, Ævar Arnfjörð Bjarmason wrote:
I just got around to testing this since it landed, for context some
previous poking of mine in [1].
Issues / stuff I've noticed:
1) We end up invalidating the untracked cache because stuff in .git/
changed. For example:
01:09:24.975524 fsmonitor.c:173 fsmonitor process '.git/hooks/fsmonitor-watchman' returned success
01:09:24.975548 fsmonitor.c:138 fsmonitor_refresh_callback '.git'
01:09:24.975556 fsmonitor.c:138 fsmonitor_refresh_callback '.git/config'
01:09:24.975568 fsmonitor.c:138 fsmonitor_refresh_callback '.git/index'
01:09:25.122726 fsmonitor.c:91 write fsmonitor extension successful
Am I missing something or should we do something like:
diff --git a/fsmonitor.c b/fsmonitor.c
index 0af7c4edba..5067b89bda 100644
--- a/fsmonitor.c
+++ b/fsmonitor.c
@@ -118,7 +118,12 @@ static int query_fsmonitor(int version, uint64_t last_update, struct strbuf *que
static void fsmonitor_refresh_callback(struct index_state *istate, const char *name)
{
- int pos = index_name_pos(istate, name, strlen(name));
+ int pos;
+
+ if (!strcmp(name, ".git") || starts_with(name, ".git/"))
+ return;
+
+ pos = index_name_pos(istate, name, strlen(name));
if (pos >= 0) {
struct cache_entry *ce = istate->cache[pos];
With that patch applied status on a large repo[2] goes from a consistent
~180-200ms to ~140-150ms, since we're not invalidating some of the UC
structure
I favor making this optimization by updating
untracked_cache_invalidate_path() so that it ignores paths under
get_git_dir() and doesn't invalidate the untracked cache or flag the
index as dirty.
*nod*
2) We re-write out the index even though we know nothing changed
When you first run with core.fsmonitor it needs to
mark_fsmonitor_clean() for every path, but is there a reason for why we
wouldn't supply the equivalent of GIT_OPTIONAL_LOCKS=0 if all paths are
marked and we know from the hook that nothing changed? Why write out the
index again?
Writing out the index when core.fsmonitor is first turned on is
necessary to get the index extension added with the current state of
the dirty flags. Given it is a one time cost, I don't think we have
anything worth trying to optimize here.
Indeed, that makes sense. What I was showing here is even after the
initial setup we continue to write it out when we know nothing changed.
We do that anyway without fsmonitor, but this seemed like a worthwhile
thing to optimize.
There is already logic to avoid writing out the index unless there is
something that requires it. In my testing, it was often the untracked
cache marking the index as dirty that was causing the unexpected writes.
The patch to make the untracked cache only flag the index as dirty when
the feature is being turned on or off is pretty simple (see below). The
challenge was that many of the untracked cache tests assume all changes
are saved to disk after every command so they fail after making this
change. The patch below does a simple hack of checking for a test
environment variable and only forcing the index writes if it is set.
Internally, we simply turned off the untracked cache as it's usefulness
in combination with GVFS is limited and without the patch, it actually
slowed down common operations more than it sped them up.
Typically, changes to the untracked cache don't accumulate long before
the user does some operation that requires the index to be written out
at which point the untracked cache is updated as well.
diff --git a/dir.c b/dir.c
index 5e93a1350b..af1d33aae1 100644
--- a/dir.c
+++ b/dir.c
@@ -2256,7 +2256,8 @@ int read_directory(struct dir_struct *dir, struct
index_state *istate,
dir->untracked->gitignore_invalidated,
dir->untracked->dir_invalidated,
dir->untracked->dir_opened);
- if (dir->untracked == istate->untracked &&
+ if (getenv("GIT_TEST_UNTRACKED_CACHE") &&
+ dir->untracked == istate->untracked &&
(dir->untracked->dir_opened ||
dir->untracked->gitignore_invalidated ||
dir->untracked->dir_invalidated))
diff --git a/t/t7063-status-untracked-cache.sh
b/t/t7063-status-untracked-cache.sh
index ea9383e8cb..e5811b6ef2 100755
--- a/t/t7063-status-untracked-cache.sh
+++ b/t/t7063-status-untracked-cache.sh
@@ -14,6 +14,8 @@ test_description='test untracked cache'
# See <20160803174522.5571-1-pclouds@xxxxxxxxx> if you want to know
# more.
+export GIT_TEST_UNTRACKED_CACHE=true
+
sync_mtime () {
if test_have_prereq BUSYBOX
then