Re: [PATCH v3 10/12] sparse-checkout: write escaped patterns in cone mode

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 1/29/2020 5:33 AM, Jeff King wrote:
> On Wed, Jan 29, 2020 at 05:17:13AM -0500, Jeff King wrote:
> 
>>> From: Derrick Stolee <dstolee@xxxxxxxxxxxxx>
>>>
>>> If a user somehow creates a directory with an asterisk (*) or backslash
>>> (\), then the "git sparse-checkout set" command will struggle to provide
>>> the correct pattern in the sparse-checkout file. When not in cone mode,
>>> the provided pattern is written directly into the sparse-checkout file.
>>> However, in cone mode we expect a list of paths to directories and then
>>> we convert those into patterns.
>>
>> Is this really about cone mode? It seems more like it is about --stdin.
>> I.e., what are the rules for when the input is a filename and when it is
>> a pattern? In our earlier discussion, I had assumed that command-line
>> arguments to "sparse-checkout set" were actual filenames, and "--stdin"
>> just read them from stdin.
>>
>> But looking at the documentation, they are always called "patterns" on
>> the command-line. Should the "--stdin" documentation make it clear that
>> we are no longer taking patterns, but instead actual filenames?
>>
>> Or am I confused, and in non-cone-mode the "ls-tree | sparse-checkout"
>> pipeline is not supposed to work at all? (I.e., they really are always
>> patterns)?
> 
> Hmph, sorry, I _was_ just confused. I was reading a copy of the manpage
> without your final patch, which made things much clearer.
> 
> So OK, I think the resulting documentation does make things clear. And
> this is just about cone mode, not --stdin. Please ignore my ramblings in
> the rest of the replied-to message. But...
> 
>>> Even more specifically, the goal is to always allow the following from
>>> the root of a repo:
>>>
>>>   git ls-tree --name-only -d HEAD | git sparse-checkout set --stdin
>>>
>>> The ls-tree command provides directory names with an unescaped asterisk.
>>> It also quotes the directories that contain an escaped backslash. We
>>> must remove these quotes, then keep the escaped backslashes.
>>>
>>> However, there is some care needed for the timing of these escapes. The
>>> in-memory pattern list is used to update the working directory before
>>> writing the patterns to disk. Thus, we need the command to have the
>>> unescaped names in the hashsets for the cone comparisons, then escape
>>> the patterns later.
>>
>> OK, this part make sense.
> 
> You could also demonstrate this even without --stdin with something
> like:
> 
>   git config core.sparsecheckoutcone true
>   git sparse-checkout set 'foo*bar'
> 
> which should take that as a literal filename and put the pattern
> 'foo\*bar' in the sparse-checkout file. And your tests do cover that.
> 
> So really there are two separate bugs here, and it might be a little
> easier to explain the "timing of these escapes" thing by doing them
> separately. I.e., the case above needs escaping and we could demonstrate
> the bug with a command-line "set".  And then follow up by fixing the
> problem with correctly de-quoting --stdin.

I've locally split the commit into two parts. That makes things much
simpler to read.

>>> +static char *escaped_pattern(char *pattern)
>> [...]
>> Do we need to catch other metacharacters here (using is_glob_special()
>> perhaps)?
> 
> After de-confusing myself, I think the individual code comments I wrote
> still apply though (especially this one).

I've applied the smaller comments and am now investigating the right
thing to do with other is_glob_special() characters. There is a small
chance that I can replace any "c == '*' || c == '\'" with is_glob_special(),
but we shall see. At the very least, I'll need to expand my tests.

Thanks,
-Stolee



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux