Re: [PATCH v3 10/12] sparse-checkout: write escaped patterns in cone mode

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Jan 29, 2020 at 05:17:13AM -0500, Jeff King wrote:

> > From: Derrick Stolee <dstolee@xxxxxxxxxxxxx>
> > 
> > If a user somehow creates a directory with an asterisk (*) or backslash
> > (\), then the "git sparse-checkout set" command will struggle to provide
> > the correct pattern in the sparse-checkout file. When not in cone mode,
> > the provided pattern is written directly into the sparse-checkout file.
> > However, in cone mode we expect a list of paths to directories and then
> > we convert those into patterns.
> 
> Is this really about cone mode? It seems more like it is about --stdin.
> I.e., what are the rules for when the input is a filename and when it is
> a pattern? In our earlier discussion, I had assumed that command-line
> arguments to "sparse-checkout set" were actual filenames, and "--stdin"
> just read them from stdin.
> 
> But looking at the documentation, they are always called "patterns" on
> the command-line. Should the "--stdin" documentation make it clear that
> we are no longer taking patterns, but instead actual filenames?
> 
> Or am I confused, and in non-cone-mode the "ls-tree | sparse-checkout"
> pipeline is not supposed to work at all? (I.e., they really are always
> patterns)?

Hmph, sorry, I _was_ just confused. I was reading a copy of the manpage
without your final patch, which made things much clearer.

So OK, I think the resulting documentation does make things clear. And
this is just about cone mode, not --stdin. Please ignore my ramblings in
the rest of the replied-to message. But...

> > Even more specifically, the goal is to always allow the following from
> > the root of a repo:
> > 
> >   git ls-tree --name-only -d HEAD | git sparse-checkout set --stdin
> > 
> > The ls-tree command provides directory names with an unescaped asterisk.
> > It also quotes the directories that contain an escaped backslash. We
> > must remove these quotes, then keep the escaped backslashes.
> > 
> > However, there is some care needed for the timing of these escapes. The
> > in-memory pattern list is used to update the working directory before
> > writing the patterns to disk. Thus, we need the command to have the
> > unescaped names in the hashsets for the cone comparisons, then escape
> > the patterns later.
> 
> OK, this part make sense.

You could also demonstrate this even without --stdin with something
like:

  git config core.sparsecheckoutcone true
  git sparse-checkout set 'foo*bar'

which should take that as a literal filename and put the pattern
'foo\*bar' in the sparse-checkout file. And your tests do cover that.

So really there are two separate bugs here, and it might be a little
easier to explain the "timing of these escapes" thing by doing them
separately. I.e., the case above needs escaping and we could demonstrate
the bug with a command-line "set".  And then follow up by fixing the
problem with correctly de-quoting --stdin.

> > +static char *escaped_pattern(char *pattern)
> [...]
> Do we need to catch other metacharacters here (using is_glob_special()
> perhaps)?

After de-confusing myself, I think the individual code comments I wrote
still apply though (especially this one).

-Peff



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux