Re: [PATCH v2] add, rm, mv: fix bug that prevents the update of non-sparse dirs

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi, Stolee

Thanks for the review!

On Tue, Oct 26, 2021 at 9:53 AM Derrick Stolee <stolee@xxxxxxxxx> wrote:
>
> On 10/25/2021 5:07 PM, Matheus Tavares wrote:
> >
> > - Simplified the implementation by unifing the code path for cone mode and
> >   full pattern mode. Since path_matches_pattern_list() never returns UNDECIDED
> >   for cone mode, it will always execute only one iteration of the loop and then
> >   find the final answer. There is no need to handle this case in a separate
> >   block.
>
> This was unexpected, but makes sense. While your commit message hints at the
> fact that cone mode never returns UNDECIDED, it doesn't explicitly mention
> that cone mode will exit the loop after a single iteration. It might be nice
> to make that explicit either in the commit message or the block comment before
> the loop.

Yup, will do.

> > - Inside the loop, made sure to change dtype to DT_DIR when going to parent
> >   directories. Without this, the pattern match would fail if we had a path
> >   like "a/b/c" and the pattern "b/" (with trailing slash).
>
> Very good. We typically need to detect the type for the first path given,
> but we know that all parents are directories. I've used this trick elsewhere.
>
> I see in the code that the first path is used as DT_REG. It's my fault, but
> perhaps it should be made more clear that path_in_sparse_checkout() will
> consider the given path as a file, not a directory. The current users of the
> method are using it properly, but I'm suddenly worried about another caller
> misinterpreting the generality of the problem.

Yeah, I was thinking about this too... I'm afraid there might be at
least two users of this function which already pass non-regular files
to it: builtin/add.c:refresh() and
sparse-index.c:convert_to_sparse_rec().

The first calls the function passing the user-given pathspec, which
may be a directory. But this one is easy to solve: I think we don't
even need the path_in_sparse_checkout() here as the `git add
--refresh` only work on tracked files, and the previous
matches_skip_worktree() call covers both skip_worktree and
non-skip_worktree index entries (maybe we should rename this function
to matches_sparse_ce()?)

As for convert_to_sparse_rec(), it seems to call
path_in_sparse_checkout() with the directory components of paths, so
something like "dir/". Perhaps we can make path_in_sparse_checkout()
receive a dtype argument and pass DT_UNKNOWN in this case?

Another case I haven't given much thought yet is submodules. For example:

git init sub &&
test_commit -C sub file &&
git submodule add ./sub &&
git commit -m sub &&
git sparse-checkout set 'sub/' &&
git mv sub sub2

Erroneously gives:
The following paths and/or pathspecs matched paths that exist
outside of your sparse-checkout definition, so will not be
updated in the index:
sub

But it works if we change DT_REG to DT_UNKNOWN in
path_in_sparse_checkout(). So, I'm not sure, should we use DT_UNKNOWN
for all calls?

> > @@ -1504,8 +1504,9 @@ static int path_in_sparse_checkout_1(const char *path,
> >                                    struct index_state *istate,
> >                                    int require_cone_mode)
> >  {
> > -     const char *base;
> >       int dtype = DT_REG;
>
> Here is where we assume a file to start.
>
> > +     enum pattern_match_result match = UNDECIDED;
> > +     const char *end, *slash;
> >
> >       /*
> >        * We default to accepting a path if there are no patterns or
> > @@ -1516,11 +1517,23 @@ static int path_in_sparse_checkout_1(const char *path,
> >            !istate->sparse_checkout_patterns->use_cone_patterns))
> >               return 1;
> >
> > -     base = strrchr(path, '/');
> > -     return path_matches_pattern_list(path, strlen(path), base ? base + 1 : path,
> > -                                      &dtype,
> > -                                      istate->sparse_checkout_patterns,
> > -                                      istate) > 0;
> > +     /*
> > +      * If UNDECIDED, use the match from the parent dir (recursively),
> > +      * or fall back to NOT_MATCHED at the topmost level.
> > +      */
> > +     for (end = path + strlen(path); end > path && match == UNDECIDED; end = slash) {
>
> nit: since this line is long and the sentinel is complicated, it might
> be worth splitting the different parts into their own lines:
>
>         for (end = path + strlen(path);
>              end > path && match == UNDECIDED;
>              end = slash) {

Good idea, I will split the lines like you did.



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux