> run_diff_files() is used not just by "git diff" but other things > like "git add", so if we get an overall speed-up without having to > pay undue cost, that would be a very good news. Agreed! I may be able to write perf benchmark tests to highlight benefits to git add as well. > 20% of the working tree, running refresh_fsmonitor() for the entire > working tree is still a win, but if we are only checking less than > that, we are better off without fsmonitor, or does a tradeoff like > that exist? My understanding is that refresh_fsmonitor is O(delta_since_last_refresh) - so for developers with large repositories - this cost will amortize out over subsequent commands, so I don't think it's worth investigating this tradeoff here. As a user of large repositories, I expect that my major source of fsmonitor activity to be user intent (eg git pull, or intentionally copying/editing a large number of files). After such a command, I expect my next git command to be slower - that would be unsurprising. I think the tradeoff could be made for small diff requests, but I don't think it's worth adding complexity here - as that user will just have to pay the cost on their next git command. > > + * eg - via git udpate-index --assume-unchanged > > + * eg - via core.ignorestat=true > > ... what are these two lines doing here? Intended to indicate potential ways that CE_VALID might be set. When I was reading the source here, it was pretty difficult to determine how this would be set. Agree that I picked unfortunate wording. Thanks for the suggestions. Will update in the next iteration. > > would probably be what you meant bo say. > > When CE_VALID is set (via "update-index --assume-unchanged" > or via adding paths while core.ignorestat is set to true), > the user has promised that the working tree file for that > path will not be modified. When CE_FSMONITOR_VALID is true, > the fsmonitor knows that the path hasn't been modified since > we refreshed the cached stat information. In either case, > we do not have to stat to see if the path has been removed > or modified. > > or something like that, perhaps. Sounds good. Will clarify. I like your comment better as well. > > > + */ > > + if (ce->ce_flags & CE_VALID || ce->ce_flags & CE_FSMONITOR_VALID) { > > Would it become easier to read, if written like this instead? > > if (ce->ce_flags & (CE_VALID | CE_FSMONITOR_VALID)) { I personally find this more confusing because it involves multiple bitwise ops, but this is potentially due to me having more mental practice thinking about boolean operators vs bitwise operators. I'm more than happy to align with the common pattern of the repo. I'll change this. > > Thanks. Thank you for the thorough review!