On 10/26/2021 6:43 PM, Matheus Tavares wrote:> On Tue, Oct 26, 2021 at 9:53 AM Derrick Stolee <stolee@xxxxxxxxx> wrote: >>> - Inside the loop, made sure to change dtype to DT_DIR when going to parent >>> directories. Without this, the pattern match would fail if we had a path >>> like "a/b/c" and the pattern "b/" (with trailing slash). >> >> Very good. We typically need to detect the type for the first path given, >> but we know that all parents are directories. I've used this trick elsewhere. >> >> I see in the code that the first path is used as DT_REG. It's my fault, but >> perhaps it should be made more clear that path_in_sparse_checkout() will >> consider the given path as a file, not a directory. The current users of the >> method are using it properly, but I'm suddenly worried about another caller >> misinterpreting the generality of the problem. > > Yeah, I was thinking about this too... I'm afraid there might be at > least two users of this function which already pass non-regular files > to it: builtin/add.c:refresh() and > sparse-index.c:convert_to_sparse_rec(). > > The first calls the function passing the user-given pathspec, which > may be a directory. But this one is easy to solve: I think we don't > even need the path_in_sparse_checkout() here as the `git add > --refresh` only work on tracked files, and the previous > matches_skip_worktree() call covers both skip_worktree and > non-skip_worktree index entries (maybe we should rename this function > to matches_sparse_ce()?) > > As for convert_to_sparse_rec(), it seems to call > path_in_sparse_checkout() with the directory components of paths, so > something like "dir/". Perhaps we can make path_in_sparse_checkout() > receive a dtype argument and pass DT_UNKNOWN in this case? This might be necessary. Thanks for digging into the details here. > Another case I haven't given much thought yet is submodules. For example: > > git init sub && > test_commit -C sub file && > git submodule add ./sub && > git commit -m sub && > git sparse-checkout set 'sub/' && > git mv sub sub2 > > Erroneously gives: > The following paths and/or pathspecs matched paths that exist > outside of your sparse-checkout definition, so will not be > updated in the index: > sub > > But it works if we change DT_REG to DT_UNKNOWN in > path_in_sparse_checkout(). So, I'm not sure, should we use DT_UNKNOWN > for all calls? This is interesting. Submodules aren't controlled by the sparse-checkout, so we should probably check the cache entry to see if it is a gitlink and skip the path_in_sparse_checkout() if so. Good find! -Stolee