Hi, Stolee Thanks for the review! On Tue, Oct 26, 2021 at 9:53 AM Derrick Stolee <stolee@xxxxxxxxx> wrote: > > On 10/25/2021 5:07 PM, Matheus Tavares wrote: > > > > - Simplified the implementation by unifing the code path for cone mode and > > full pattern mode. Since path_matches_pattern_list() never returns UNDECIDED > > for cone mode, it will always execute only one iteration of the loop and then > > find the final answer. There is no need to handle this case in a separate > > block. > > This was unexpected, but makes sense. While your commit message hints at the > fact that cone mode never returns UNDECIDED, it doesn't explicitly mention > that cone mode will exit the loop after a single iteration. It might be nice > to make that explicit either in the commit message or the block comment before > the loop. Yup, will do. > > - Inside the loop, made sure to change dtype to DT_DIR when going to parent > > directories. Without this, the pattern match would fail if we had a path > > like "a/b/c" and the pattern "b/" (with trailing slash). > > Very good. We typically need to detect the type for the first path given, > but we know that all parents are directories. I've used this trick elsewhere. > > I see in the code that the first path is used as DT_REG. It's my fault, but > perhaps it should be made more clear that path_in_sparse_checkout() will > consider the given path as a file, not a directory. The current users of the > method are using it properly, but I'm suddenly worried about another caller > misinterpreting the generality of the problem. Yeah, I was thinking about this too... I'm afraid there might be at least two users of this function which already pass non-regular files to it: builtin/add.c:refresh() and sparse-index.c:convert_to_sparse_rec(). The first calls the function passing the user-given pathspec, which may be a directory. But this one is easy to solve: I think we don't even need the path_in_sparse_checkout() here as the `git add --refresh` only work on tracked files, and the previous matches_skip_worktree() call covers both skip_worktree and non-skip_worktree index entries (maybe we should rename this function to matches_sparse_ce()?) As for convert_to_sparse_rec(), it seems to call path_in_sparse_checkout() with the directory components of paths, so something like "dir/". Perhaps we can make path_in_sparse_checkout() receive a dtype argument and pass DT_UNKNOWN in this case? Another case I haven't given much thought yet is submodules. For example: git init sub && test_commit -C sub file && git submodule add ./sub && git commit -m sub && git sparse-checkout set 'sub/' && git mv sub sub2 Erroneously gives: The following paths and/or pathspecs matched paths that exist outside of your sparse-checkout definition, so will not be updated in the index: sub But it works if we change DT_REG to DT_UNKNOWN in path_in_sparse_checkout(). So, I'm not sure, should we use DT_UNKNOWN for all calls? > > @@ -1504,8 +1504,9 @@ static int path_in_sparse_checkout_1(const char *path, > > struct index_state *istate, > > int require_cone_mode) > > { > > - const char *base; > > int dtype = DT_REG; > > Here is where we assume a file to start. > > > + enum pattern_match_result match = UNDECIDED; > > + const char *end, *slash; > > > > /* > > * We default to accepting a path if there are no patterns or > > @@ -1516,11 +1517,23 @@ static int path_in_sparse_checkout_1(const char *path, > > !istate->sparse_checkout_patterns->use_cone_patterns)) > > return 1; > > > > - base = strrchr(path, '/'); > > - return path_matches_pattern_list(path, strlen(path), base ? base + 1 : path, > > - &dtype, > > - istate->sparse_checkout_patterns, > > - istate) > 0; > > + /* > > + * If UNDECIDED, use the match from the parent dir (recursively), > > + * or fall back to NOT_MATCHED at the topmost level. > > + */ > > + for (end = path + strlen(path); end > path && match == UNDECIDED; end = slash) { > > nit: since this line is long and the sentinel is complicated, it might > be worth splitting the different parts into their own lines: > > for (end = path + strlen(path); > end > path && match == UNDECIDED; > end = slash) { Good idea, I will split the lines like you did.