Duy Nguyen <pclouds <at> gmail.com> writes:
>
> First step would be enabling that because besides directory
> traversing, this code does a lot of string processing that's hopefully
> eliminated by untracked cache extension. I cut git-status' time in
> half with it. The side effect though, is that it creates a new flow of
> stat(), one per directory. It would be great if you could do some
> measurements with untracked cache on Windows and see if we get similar
> gain.

These numbers were captured with core.fscache and core.untrackedcache
both set to true in the belief that it would produce the best
performance. If that is incorrect, I'm happy to capture numbers with
other options set.

If you drill the next step down into the call tree, the bulk of the cost
of read_directory is coming from mingw_stat (85.6%). These numbers are
inclusive in that they reflect the cost of the function plus any of its
callees.

If you look at the overall cost of functions exclusively (i.e. counting
only the cost of the function itself and not its children),
kernelbase!CreateFileW, kernelbase!CloseHandle, and
kernelbase!GetFileInformationByHandle dominate (84.9% of the time).
These functions form the basis of the stat emulation on Windows,
although the fscache certainly has an impact on what is happening here
as well.

While we can certainly work to speed these up, the biggest performance
win will be eliminating as many of these calls as possible, which is
what we are hoping to accomplish by using Watchman to limit the files
that _need_ a stat call.

> > Given there were no dirty files, Watchman would have made a huge
> > improvement in the overall time but index helper would have had
> > relatively little impact. We've noticed this same pattern in all our
> > repos which is what is driving our interest in the Watchman model and
> > also shows why index-helper is of less interest.
>
> Assuming that untracked cache cuts git-status time by half on Windows
> as well, then index read time now takes a bigger percentage, 8%, where
> it starts to make sense to optimize it.
>
> On a quiet repository, having watchman does not help so much because
> we already reduce the significant number of filesystem-related system
> calls. Yes there is lstat() and eliminating it may gain some more
> (with watchman) and it matters on a repo with lots of directories. You
> probably can just take these lstat out (I can help point out where)
> and see how much the gain is.

I don't understand why Watchman won't provide a _significant_
improvement here. My understanding is that it will minimize the stat
calls to those files that may have changed, as well as limiting the
untracked cache to only scanning those directories that may have changes
in them. In this particular scenario, _no_ files have changed, so
Watchman would return an empty set, thus eliminating virtually all the
stat calls and directory enumerations. I'd expect this to result in a
>90% savings. Am I missing something?

> Assuming (blindly again) that removing lstat helps like 10% of
> read_index(), we would need to profile untracked cache code and see
> where what we can do next. There are still a lot of directory
> traversing there (except that it traverses the cache instead of
> filesystem) and maybe we can do something. But I haven't gotten that
> far.
>
> > While the current design hides watchman behind index-helper, if we were
> > to change that model so they were independent, we would be interested
> > in doing it in such a way that provided some abstraction so that it
> > could be replaced with another file watching daemon.
>
> Frankly I'm not that interested in replacing another file watching
> daemon. My first attempt at this problem was "file-watcher" which was
> tied to Linux inotify system only and it would make sense to make it
> replaceable.
> But watchman supports OS X, Linux, FreeBSD and Windows
> now, not just Linux only as before, why another abstraction layer? You
> could even replace "watchman.exe" binary. As long as you produce the
> same json data, your new daemon should still work.

This is a simplification it would be nice to make, as we have other code
already running that can report on all the changes happening. It would
enable us to remove the additional complexity of the Watchman daemon.
I'm sure we can make it work either way by emulating the Watchman
interface and output.

> Tying index caching daemon and file watching daemon (let's avoid
> "watchman" for now) gives us a bonus. Because both git and the caching
> daemon know that they read the same index, we could answer the
> question "what files are dirty?" with "file number 1, 5, 9 in the
> index" instead of sending full paths and git has to make some more
> lookups to identify them. In this series we send it as a compressed
> bit map. Probably not a big deal in terms of real performance, but it
> feels nice to do lookups less.

Today, Watchman returns a list of files and directories and then creates
the compressed bitmap that index-helper uses. The work of looking those
entries up in the index and then creating the bitmap still has to
happen, so I suspect you are correct that it doesn't make much of a real
performance difference. It's just moving where that lookup and bitmap
creation happens.

I'm in the process of prototyping this. The prototype currently skips
much of the existing process: iterating through the list of changed
files, looking up each entry in the index, creating the bitmap, passing
that bitmap through the WAMA section to index-helper, reading the WAMA
section in git, and then looping through the bitmap to set the
CE_WATCHMAN_DIRTY bit on the corresponding index entries and update the
untracked cache. Instead, I iterate through the list of changed files,
look up each entry in the index, and directly set the dirty bit, all
within read-cache.c.
At this point, it's <100 lines of code. I'll keep fleshing this out and
get some perf numbers once it's working.

While this saves several steps and reduces complexity (index-helper and
the WAMA section are no longer involved), it does mean that read-cache.c
needs a platform-independent way to query and receive the list of
modified files and directories. It also means we need a way to store the
date/time string passed to the file watching daemon in the index. Using
the WAMA section just to store the date/time seems a bit heavy, but it
can work.

> The second reason is because watchman daemon alone does not provide
> enough information to reduce untracked cache's lstat() as much as
> possible. The current approach in this series is a naive one, which
> works for some cases, but not optimal (I'll get to that). We need a
> separate long-running daemon to maintain extra info to reduce lstat().
> Because our target is watchman, it does not make sense to add yet
> another daemon besides index-helper to do this.
>
> OK the optimal lstat() reduction thing. Right now, if any file in a
> directory is updated, the directory is invalidated in untracked cache
> and we need to traverse it to collect excluded files again. But it
> does not have to be that way. We don't care if any file is _updated_
> because it will not change untracked cache output. We care about what
> files are _added_ or _deleted_. New files will need to be classified
> as either tracked, untracked or ignored. Deleted files may invalid
> either three file lists. Watchman cannot answer "what files are added
> or deleted since the point X in time" and I agree that it's not
> watchman's job (watchman issue 65). So we have to maintain some more
> info by ourselves, e.g. the list of files at any requested "clock".
> With that we can compare the file lists of two "clock"s and tell git
> what files are added or deleted.
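
To check that I'm reading the quoted proposal correctly, here is a
minimal Python sketch of comparing the file lists recorded at two
clocks. The snapshot representation (a plain collection of paths per
clock) and the names diff_snapshots and dirs_to_invalidate are my own
assumptions for illustration only; nothing like this exists in the
series or in Watchman.

```python
def diff_snapshots(old_paths, new_paths):
    """Return (added, deleted) paths between two file-list snapshots.

    Hypothetical helper: each snapshot is the list of paths a daemon
    recorded at one "clock". Paths present in both snapshots are
    ignored, because a plain update does not change untracked cache
    output.
    """
    old, new = set(old_paths), set(new_paths)
    added = sorted(new - old)
    deleted = sorted(old - new)
    return added, deleted


def dirs_to_invalidate(added, deleted):
    """Return the directories whose untracked cache entry needs a rescan."""
    dirs = set()
    for path in added + deleted:
        parent, _, _ = path.rpartition("/")
        dirs.add(parent)  # "" stands for the top-level directory
    return sorted(dirs)
```

Under these assumptions, a file that is merely updated appears in both
snapshots and so never invalidates its directory.
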
This sounds like a nice optimization, but right now I'm focused on how
we can scope the cost of status to the files and directories that may
have changes. With large repos, this is a small subset of the overall
repo and will hopefully be enough to make the performance reasonable.
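
As a rough Python illustration of that scoping (the real change would be
C code in read-cache.c), the sketch below flags only the index entries
named by a file watcher, so only those would need a stat call. The names
IndexEntry, mark_dirty and entries_needing_stat are hypothetical, the
dirty field merely stands in for the CE_WATCHMAN_DIRTY bit, and the
changed-path list stands in for whatever the file watching daemon
returns.

```python
from dataclasses import dataclass


@dataclass
class IndexEntry:
    path: str
    dirty: bool = False  # stands in for the CE_WATCHMAN_DIRTY bit


def mark_dirty(entries, changed_paths):
    """Flag index entries reported as changed.

    Returns the reported paths that are not in the index; those are
    candidates for the untracked cache rather than for a tracked-file
    stat.
    """
    by_path = {entry.path: entry for entry in entries}
    not_in_index = []
    for path in changed_paths:
        entry = by_path.get(path)
        if entry is not None:
            entry.dirty = True
        else:
            not_in_index.append(path)
    return not_in_index


def entries_needing_stat(entries):
    """Only entries flagged dirty would be stat'ed under this model."""
    return [entry.path for entry in entries if entry.dirty]
```

With an empty changed-path list, as in the no-dirty-files scenario
measured above, entries_needing_stat returns nothing at all.
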