Re: [WIP v2 4/5] mv: add check_dir_in_index() and solve general dir check issue

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, May 27, 2022 at 11:27 PM Derrick Stolee
<derrickstolee@xxxxxxxxxx> wrote:
>
> On 5/27/2022 6:08 AM, Shaoxuan Yuan wrote:
> > +/*
> > + * Check if an out-of-cone directory should be in the index. Imagine this case
> > + * that all the files under a directory are marked with 'CE_SKIP_WORKTREE' bit
> > + * and thus the directory is sparsified.
> > + *
> > + * Return 0 if such directory exist (i.e. with any of its contained files not
> > + * marked with CE_SKIP_WORKTREE, the directory would be present in working tree).
> > + * Return 1 otherwise.
> > + */
> > +static int check_dir_in_index(const char *name, int namelen)
> > +{
> > +     int ret = 1;
> > +     const char *with_slash = add_slash(name);
> > +     int length = namelen + 1;
> > +
> > +     int pos = cache_name_pos(with_slash, length);
> > +     const struct cache_entry *ce;
> > +
> > +     if (pos < 0) {
> > +             pos = -pos - 1;
> > +             if (pos >= the_index.cache_nr)
> > +                     return ret;
> > +             ce = active_cache[pos];
> > +             if (strncmp(with_slash, ce->name, length))
> > +                     return ret;
> > +             if (ce_skip_worktree(ce))
> > +                     return ret = 0;
>
> This appears to check if the _first_ entry under the directory
> is sparse, but not if _all_ entries are sparse. These are not
> the same thing, even in cone-mode sparse-checkout. The t1092
> test directory has files like "folder1/0/0/a" but if
> "folder1/1" is in the sparse-checkout cone, then that first
> entry has the skip-worktree bit, but "folder1/1/a" and "folder1/a"
> do not.

Yes, it is checking the first entry and this would not work without the
lstat in the front. But I think the "lstat < 0" makes sure that this directory
cannot be partially sparsified.

It is either missing both in the worktree and index, or missing in the worktree
but present in index (with all its content sparsified). And because of that,
I think only the first entry needs to be checked.

> > +     }
> > +     return ret;
>
> At the moment, it doesn't seem like we need 'ret' since the
> only place you set it is in "return ret = 0;" (which could
> just be "return 0;" while the others are "return 1;"). But,
> perhaps you intended to create a loop over 'pos' while
> with_slash is a prefix of the cache entry?

I agree that this variable is redundant. But I fail to understand
the logical relation between before "But," and after "But,". Please
elaborate on that?

> > +                     else if (!check_dir_in_index(src, length) &&
> > +                                      !path_in_sparse_checkout(src_w_slash, &the_index)) {
>
> style-nit: You'll want to align the different parts of your
> logical statement to agree with the end of the "else if (",
>
>         else if (A &&
>                  B) {
>

This one is interesting because it appears just alright in my VSCode editor.
Later I found that it is because git-diff is using a tab size of 8 or something,
but my VSCode uses tab size of 4. After I configured the git-diff tab rendering
size, it looks alright. Same for another style nit down below.

> > +                             modes[i] = SKIP_WORKTREE_DIR;
>
> If we are moving to a flags-based model, should we convert all
> "modes[i] =" to "modes[i] |=" as a first step (before adding the
> SKIP_WORTKREE_DIR flag)?
>
> > +                             goto dir_check;
>
> Hm. While I did recommend using 'goto' to jump to a common end
> place in the loop body, I'm not sure about jumping into another
> else-if statement. This might be a good time to extract the
> code from "else if (src_is_dir)" below into a helper method that
> can be used in both places.

Right, this is suspicious. I wasn't familiar at all with C/C++, and being able
to do this inter-if-else jump also startled me.
I agree that it should be something more legitimate, like extracting a
method for it.

> > +                     }
> >                       /* only error if existence is expected. */
> >                       else if (modes[i] != SPARSE)
> >                               bad = _("bad source");
> > @@ -218,7 +264,9 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
> >                               && lstat(dst, &st) == 0)
> >                       bad = _("cannot move directory over file");
> >               else if (src_is_dir) {
> > -                     int first = cache_name_pos(src, length), last;
> > +                     int first, last;
> > +dir_check:
> > +                     first = cache_name_pos(src, length);
> >
> >                       if (first >= 0)
> >                               prepare_move_submodule(src, first,
> > @@ -229,7 +277,8 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
> >                       else { /* last - first >= 1 */
> >                               int j, dst_len, n;
> >
> > -                             modes[i] = WORKING_DIRECTORY;
> > +                             if (!modes[i])
> > +                                     modes[i] |= WORKING_DIRECTORY;
>
> This appears to only add the WORKING_DIRECTORY flag if modes[i] is
> already zero. This maybe implies that we wouldn't understand
> "WORKING_DIRECTORY | SKIP_WORKTREE_DIR" as a value.

At this point, I cannot think of the reason for writing it this way. And yes,
this does not make sense...

> >                               n = argc + last - first;
> >                               REALLOC_ARRAY(source, n);
> >                               REALLOC_ARRAY(destination, n);
> > @@ -331,7 +380,8 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
> >                       printf(_("Renaming %s to %s\n"), src, dst);
> >               if (show_only)
> >                       continue;
> > -             if (mode != INDEX && mode != SPARSE && rename(src, dst) < 0) {
> > +             if (!(mode & (INDEX | SPARSE | SKIP_WORKTREE_DIR)) &&
> > +                     rename(src, dst) < 0) {
>
> style-nit: align your logical statements.
>
> >                       if (ignore_errors)
> >                               continue;
> >                       die_errno(_("renaming '%s' failed"), src);
> > @@ -345,7 +395,7 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
> >                                                             1);
> >               }
> >
> > -             if (mode == WORKING_DIRECTORY)
> > +             if (mode & (WORKING_DIRECTORY | SKIP_WORKTREE_DIR))
> >                       continue;
>
> Ok, here you check if _either_ mode is enabled, which is good. Maybe
> you don't need the "if (!mode[i])" part above.
>
> Thanks,
> -Stolee

-- 
Thanks & Regards,
Shaoxuan



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux