On Tue, Jun 2, 2020 at 11:38 PM Elijah Newren <newren@xxxxxxxxx> wrote: > > On Sun, May 31, 2020 at 9:44 PM Matheus Tavares Bernardino > <matheus.bernardino@xxxxxx> wrote: > > > > On Sat, May 30, 2020 at 12:48 PM Elijah Newren <newren@xxxxxxxxx> wrote: > > > > > > On Wed, May 27, 2020 at 6:13 PM Matheus Tavares > > > <matheus.bernardino@xxxxxx> wrote: > > > > > > > > +static int in_sparse_checkout(struct strbuf *path, int prefix_len, > > > > > > This function name in_sparse_checkout() makes me think "Does the > > > working tree represent a sparse checkout?" Perhaps we could rename it > > > to path_matches_sparsity_patterns() ? > > > > > > Also, is there a reason we can't use dir.c's > > > path_matches_pattern_list() here? > > > > Oh, we do use path_matches_pattern_list() inside: > > > > > > + *match = path_matches_pattern_list(path->buf, path->len, > > > > + path->buf + prefix_len, &dtype, > > > > + sparsity, istate); > > > > + if (*match == UNDECIDED) > > > > + *match = parent_match; > > > > > How does this new function differ > > > in behavior from that function? > > > > The idea of in_sparse_checkout() is to implement a logic closer to > > what we have in clear_ce_flags_1(). Here, it is effectively a wrapper > > to path_matches_pattern_list() but with some extra logic to decide > > whether grep should search in a given entry, based on its mode, the > > match result against the sparsity patterns, and the result from the > > parent dir. > > I've had this response and one to 5/5 sitting in my draft folder for > over a day because I was hoping to go read clear_ce_flags_1() and find > out what it is. I have no idea, so your answer doesn't answer my > question... ;-) I'll try to find some time and maybe respond further > after I do. Oops, sorry for the incomplete answer. clear_ce_flags() recursively traverses the index entries, unsetting the bits specified in a given mask when the entry matches a given pattern list. (It is used in unpack-trees.c:mark_new_skip_worktree() to clear the CE_NEW_SKIP_WORKTREE bit for the matched entries.) clear_ce_flags() does use path_matches_pattern_list() but it also has to check some additional rules for cone mode (as there might be recursive matches/non-matches). These rules are implemented in clear_ce_flags_dir(). in_sparse_checkout() is a small wrapper around path_matches_pattern_list() with (1) the additional checks for cone mode, similar to what clear_ce_flags_dir() implements, and (2) the usage of the parent dir's match_result when undecided about the current path. We could just implement this directly in grep_tree(), but I thought that isolating this logic into its own static function would make grep_tree() more readable. > > > > diff --git a/t/t7817-grep-sparse-checkout.sh b/t/t7817-grep-sparse-checkout.sh > > > > new file mode 100755 > > > > index 0000000000..ce080cf572 > > > > --- /dev/null > > > > +++ b/t/t7817-grep-sparse-checkout.sh > > > > > > Looks good. Do we want to add a testcase where a file is unmerged and > > > present in the working copy despite not matching the sparsity patterns > > > (i.e. to emulate being in the middle of a merge/rebase/cherry-pick)? > > > > Sure, I can add that. But after a quick test here, it seems that the > > unmerged path doesn't have the SKIP_WORKTREE bit set. Is this how it > > should be? > > Right, the merge machinery will clear the SKIP_WORKTREE bit when it > writes out conflicted files. Also, any future 'git sparse-checkout' > commands will see the unmerged entry and avoid marking it as > SKIP_WORKTREE even though it doesn't match the sparsity patterns. > Thus, grep doesn't have to do any special checking for whether the > files are merged or not, and from your current implementation probably > doesn't look like a special case at all -- you just check the > SKIP_WORKTREE bit. > > However, I think the test still has value because the test enforces > that other areas of the code (merge, sparse-checkout) don't break the > invariants that grep is relying on. (I could see someone making a > merge change that keeps the SKIP_WORKTREE bit accidentally set even > though it writes the file out to the working tree, for example.) > Sure, merge has some tests around that, so it might be viewed as > slightly duplicative, but I see it as an interesting edge case that > exercises whether the SKIP_WORKTREE bit should really be set and since > grep expects a certain invariant about how that is handled, the > testcase will help make sure our expectations aren't violated. OK. I will add this test for the next version.