On 8/24/22 5:06 PM, Shaoxuan Yuan wrote: > On 8/17/2022 10:23 PM, Derrick Stolee wrote: >> On 8/17/2022 3:56 AM, Shaoxuan Yuan wrote: >>> Turn on sparse index and remove ensure_full_index(). >>> - /* TODO: audit for interaction with sparse-index. */ >>> - ensure_full_index(repo->index); >>> + if (grep_sparse) > > A side note: this condition should be `grep_sparse && cached`. > >>> + ensure_full_index(repo->index); >>> + >> As mentioned before, this approach is the simplest way to make the case >> without --sparse faster, but the case _with_ --sparse will still be slow. >> The way to fix this would be to modify this portion of the loop: > > I'm not sure. If --sparse here means we want to expand the index, it > is expected to be slow (ensure_full_index is slow), isn't it? > >> if (S_ISREG(ce->ce_mode) && >> match_pathspec(repo->index, pathspec, name.buf, name.len, 0, NULL, >> S_ISDIR(ce->ce_mode) || >> S_ISGITLINK(ce->ce_mode))) { >> >> by adding an initial case >> >> if (S_ISSPARSEDIR(ce->ce_mode)) { >> hit |= grep_tree(opt, &ce->oid, name.buf, 0, name.buf); >> } else if (S_ISREG(ce->ce_mode) && >> match_pathspec(repo->index, pathspec, name.buf, name.len, 0, NULL, >> S_ISDIR(ce->ce_mode) || >> S_ISGITLINK(ce->ce_mode))) { >> >> and appropriately implement "grep_tree()" to walk the tree at ce->oid to >> find all matching files within, then call grep_oid() for each of those >> paths. > > Tree walking is faster, yes. So, for this approach to be faster, I > think you are suggesting we should not expand the index, even when > --sparse is given? Instead, we just rely on the tree walking logic, > right? Yes. Tree walking is a sizeable portion of the cost of expanding the index, but we also avoid constructing the new index _and_ we can use the t1092 tests to show that we are satisfying the behavior without resorting to ensure_full_index(). It shows that we are doing the "most correct" thing. Walking trees also provides the way to speed up when focused on a pathspec, since maybe the pathspec reduces the scope of the tree search automatically (from existing tree-walking logic). Expanding the index means "walk all the trees, then scan all the files" when there might be better things to do instead. Thanks, -Stolee