On Tue, Feb 6, 2018 at 8:48 AM, Ben Peart <peartben@xxxxxxxxx> wrote: > With the new behavior, making a change in dir1/, then calling status would > update the dir1/ untracked cache entry but not write it out. On the next > status, git would detect the change in dir1/ again and update the untracked Thing only missing piece here I would add is, this dir1/ detection is done by watchman. We have to contact watchman and ask the set of changed paths since $TIME where $TIME is the last time we updated untracked cache and invalidate those paths in core. Then it should work correctly. I checked the watchman query in the fsmonitor hook and I think it's correct so far. Except one point. What if you activate fsmonitor, write down the fsmonitor_last_update field there but _not_ create UNTR extension for the same timestamp? UNTR extension is created only when read_directory() is executed and we don't do that in every command. I haven't checked fsmonitor.c carefully, maybe I'm missing something. One thing I'd like to point out in this patch is untracked cache will start degrading if you just make the initial UNTR extension once and don't ever update it again. Dirty paths hit slow path and will start poking the disk. If somebody accidentally change a minor thing in every single directory (worst case scenario), the untracked cache becomes useless. We need a threshold or something to start repairing UNTR extension if it gets damaged too much. If you rely on random index updates (e.g. the main entries got updated and .git/index must be updated) to write UNTR extension down together. Please don't do that, at least not this way. cache_changed mask should reflect all dirty parts in .git/index. If UNTR extension is not marked updated, it's legit to just skip generating/writing it down (e.g. if I kept the old UNTR extension from the last time I read .git/index around in memory) > cache. All of the other cached entries are still valid and the cache would > be used for them. The updated cache entry for dir1/ would not get persisted > to disk until something that required the index to be written out. > > The behavior is correct in both cases. You just don't get the benefit of > the updated cache for the dir1/ entry until the index is persisted again. > What you gain in exchange is that you don't have to write out the index > which is (typically) a lot more expensive than checking dir1/ for changes. This is another thing that bugs me. I know you're talking about huge index files, but at what size should we start this sort of optimization? Writing down a few MBs on linux is cheap enough that I won't bother optimizing (and I get my UNTR extension repaired all the time, so reduced lstat calls and stuff). This "typically" only comes at certain size, what size? -- Duy