Re: [PATCH] submodule operations: tighten pathspec errors

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Jun 1, 2016 at 2:14 PM, Junio C Hamano <gitster@xxxxxxxxx> wrote:
> Stefan Beller <sbeller@xxxxxxxxxx> writes:
>
>> On Thu, May 26, 2016 at 1:00 PM, Junio C Hamano <gitster@xxxxxxxxx> wrote:
>>
>>>> @@ -36,10 +37,9 @@ static int module_list_compute(int argc, const char **argv,
>>>>
>>>>       for (i = 0; i < active_nr; i++) {
>>>>               const struct cache_entry *ce = active_cache[i];
>>>> -
>>>> -             if (!match_pathspec(pathspec, ce->name, ce_namelen(ce),
>>>> -                                 0, ps_matched, 1) ||
>>>> -                 !S_ISGITLINK(ce->ce_mode))
>>>> +             if (!S_ISGITLINK(ce->ce_mode) ||
>>>> +                 !match_pathspec(pathspec, ce->name, ce_namelen(ce),
>>>> +                                 0, ps_matched, 1))
>>>>                       continue;
>>>
>>> OK, this is the crucial bit in this patch. pathspec matches are now
>>> done only against gitlinks, so any unmatch is "pattern or path
>>> given, but there was no such submodule".
>>
>> right.
>
> That changes the semantics, and its user visible effect may deserve
> to be in the documentation, no?
>
> It used to be that "git submodule--helper list COPYING" did not
> complain, but with this change, it would.  Which may be a good
> change, but "git submodule--helper list sub*" where most but not all
> of glob expansion done by the shell are submodule directories may be
> a common thing people may do, and it may be irritating to see the
> unmatch complaints.  I dunno.

I fixed all the variable names and was confident with what I had,
but this is an issue indeed as the error message is plain wrong:
    git$ git submodule--helper list sub*
    error: pathspec 'submodule.c' did not match any file(s) known to git.
    error: pathspec 'submodule-config.c' did not match any file(s) known to git.
    error: pathspec 'submodule-config.h' did not match any file(s) known to git.
    error: pathspec 'submodule-config.o' did not match any file(s) known to git.
    error: pathspec 'submodule.h' did not match any file(s) known to git.
    error: pathspec 'submodule_multiple_references' did not match any
file(s) known to git.
    error: pathspec 'submodule.o' did not match any file(s) known to git.
    error: pathspec 'submodulespec.o' did not match any file(s) known to git.

(I realize I have some stray object files laying around)

So I do not think we can do

-               if (!match_pathspec(pathspec, ce->name, ce_namelen(ce),
-                                   0, ps_matched, 1) ||
-                   !S_ISGITLINK(ce->ce_mode))
+               if (!S_ISGITLINK(ce->ce_mode) ||
+                   !match_pathspec(pathspec, ce->name, ce_namelen(ce),
+                                   0, ps_matched, 1))

or we need to have custom error reporting, which checks if any file
still matches the pathspec (and ignore non gitlinks)

>
> When we know we are not going to complain (i.e. --unmatch-ok option

I'd rather go with ignore-unmatch as git-rm does use that.

> is given from the command line), I think it is perfectly fine (and
> it is even preferrable) to swap the order of the check.  The mode
> check done with S_ISGITLINK() is much cheaper and it is likely to
> yield true much less often, which in turn would allow us to make
> fewer calls to more expensive match_pathspec().

As said, in that case we would then need a
    pathspec_mark_ps_matched(ce->name, ce_namelen(ce), ps_matched)

>
> But when we want to diagnose typo (i.e. --unmatch-ok was not given),
> we may want to preserve the current order, so that the "sub*"
> example in a few paragraphs ago would not irritate the users.

I see.

     if (!S_ISGITLINK(ce->ce_mode)) {
        pathspec_mark_ps_matched(...);
        continue;
    } else {
        if ( !match_pathspec(pathspec, ce->name, ce_namelen(ce),
                                     0, ps_matched, 1))
        continue;
    }
    ...

doesn't look very appealing, so I guess we'd just keep the current behavior
of
-               if (!match_pathspec(pathspec, ce->name, ce_namelen(ce),
-                                   0, ps_matched, 1) ||
-                   !S_ISGITLINK(ce->ce_mode))


>
>>> It is tempting to update report_path_error() return "OK" when its
>>> first parameter is NULL.
>>
>> such that we can do a
>>
>>     if (report_path_error(unmatch_ok ? NULL : ps_matched, pathspec, prefix)))
>>         result = -1;
>
> I actually was thinking about setting ps_matched to NULL so that
> both match_pathspec() and report_path_error() would get NULL.
> match_pathspec() has to do _more_ work when ps_matched[] aka seen[]
> is given.

Yes for one call; No from the birds eye view;
the first lines that use the seen[]:

    if (seen && seen[i] == MATCHED_EXACTLY)
        continue;

so if we have  `ps->nr > 0` then it is beneficial to have a
seen array, I think?

>
>>>> @@ -407,7 +410,7 @@ cmd_foreach()
>>>>       # command in the subshell (and a recursive call to this function)
>>>>       exec 3<&0
>>>>
>>>> -     git submodule--helper list --prefix "$wt_prefix"|
>>>> +     git submodule--helper list ${unmatch:+--unmatch} --prefix "$wt_prefix"|
>>>
>>> For this to work, somebody must ensure that the variable unmatch is
>>> either unset or set to empty unless the user gave --error-unmatch to
>>> us.  There is a block of empty assignments hear the beginning of
>>> this file for that very purpose, i.e. resetting a stray environment
>>> variable that could be in user's environment.

done.

>>>
>>> The patch itself does not look too bad.  I do not have an opinion on
>>> which one should be the default, and I certainly would understand if
>>> you want to keep the default loose (i.e. not complaining) with an
>>> optional error checking, but whichever default you choose, the
>>> option and variable names need to be clarified to avoid confusion.
>>
>> Ok I'll fix the variable names; I think for consistency with e.g.
>> ls-files --error-unmatch
>> we would want to be loose by default and strict on that option.
>
> I do not think the "typo-prevention" safety measure should suddenly
> be turned into opt-in; I'd suggest "--unmatch-ok" instead.

--ignore-unmatch has the same meaning and is taken by git-rm already?
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]