On Wed, Jan 29, 2020 at 05:17:13AM -0500, Jeff King wrote: > > From: Derrick Stolee <dstolee@xxxxxxxxxxxxx> > > > > If a user somehow creates a directory with an asterisk (*) or backslash > > (\), then the "git sparse-checkout set" command will struggle to provide > > the correct pattern in the sparse-checkout file. When not in cone mode, > > the provided pattern is written directly into the sparse-checkout file. > > However, in cone mode we expect a list of paths to directories and then > > we convert those into patterns. > > Is this really about cone mode? It seems more like it is about --stdin. > I.e., what are the rules for when the input is a filename and when it is > a pattern? In our earlier discussion, I had assumed that command-line > arguments to "sparse-checkout set" were actual filenames, and "--stdin" > just read them from stdin. > > But looking at the documentation, they are always called "patterns" on > the command-line. Should the "--stdin" documentation make it clear that > we are no longer taking patterns, but instead actual filenames? > > Or am I confused, and in non-cone-mode the "ls-tree | sparse-checkout" > pipeline is not supposed to work at all? (I.e., they really are always > patterns)? Hmph, sorry, I _was_ just confused. I was reading a copy of the manpage without your final patch, which made things much clearer. So OK, I think the resulting documentation does make things clear. And this is just about cone mode, not --stdin. Please ignore my ramblings in the rest of the replied-to message. But... > > Even more specifically, the goal is to always allow the following from > > the root of a repo: > > > > git ls-tree --name-only -d HEAD | git sparse-checkout set --stdin > > > > The ls-tree command provides directory names with an unescaped asterisk. > > It also quotes the directories that contain an escaped backslash. We > > must remove these quotes, then keep the escaped backslashes. > > > > However, there is some care needed for the timing of these escapes. The > > in-memory pattern list is used to update the working directory before > > writing the patterns to disk. Thus, we need the command to have the > > unescaped names in the hashsets for the cone comparisons, then escape > > the patterns later. > > OK, this part make sense. You could also demonstrate this even without --stdin with something like: git config core.sparsecheckoutcone true git sparse-checkout set 'foo*bar' which should take that as a literal filename and put the pattern 'foo\*bar' in the sparse-checkout file. And your tests do cover that. So really there are two separate bugs here, and it might be a little easier to explain the "timing of these escapes" thing by doing them separately. I.e., the case above needs escaping and we could demonstrate the bug with a command-line "set". And then follow up by fixing the problem with correctly de-quoting --stdin. > > +static char *escaped_pattern(char *pattern) > [...] > Do we need to catch other metacharacters here (using is_glob_special() > perhaps)? After de-confusing myself, I think the individual code comments I wrote still apply though (especially this one). -Peff