On Thu, Jun 2, 2016 at 2:54 PM, Junio C Hamano <gitster@xxxxxxxxx> wrote: > Stefan Beller <sbeller@xxxxxxxxxx> writes: > >> In our own .gitattributes file we have attributes such as: >> >> *.[ch] whitespace=indent,trail,space >> >> When querying for attributes we want to be able to ask for the exact >> value, i.e. >> >> git ls-files :(attr:whitespace=indent,trail,space) >> >> should work, but the commas are used in the attr magic to introduce >> the next attr, ... >> ... >> So here is the "escaping only, but escaping done right" version. >> (It goes on top of sb/pathspec-label) > > The phrase "should work" is probably a bit too strong (I'd have said > "it would be nice if this worked"), as we do not have to even > support comma for our immediately expected use cases. Allowing it > merely makes a casual test using our own .gitattributes easier. > >> +static size_t strcspn_escaped(const char *s, const char *reject) > > Perhaps s/reject/stop/? > >> +{ >> + const char *i, *j; >> + >> + for (i = s; *i; i++) { >> + /* skip escaped the character */ >> + if (i[0] == '\\' && i[1]) { >> + i++; >> + continue; >> + } >> + /* see if any of the chars matches the current character */ >> + for (j = reject; *j; j++) >> + if (!*i || *i == *j) >> + return i - s; > > I somehow doubt that *i can be NUL here. In any case, this looks > more like > > /* is *i is one of the stop codon? */ > if (strchr(stop, *i)) > break; > >> + } >> + return i - s; >> +} > >> +static char *attr_value_unescape(const char *value) >> +{ >> + char *i, *ret = xstrdup(value); >> + >> + for (i = ret; *i; i++) { >> + if (i[0] == '\\') { >> + if (!i[1]) >> + die(_("Escape character '\\' not allowed as " >> + "last character in attr value")); >> + >> + /* remove the backslash */ >> + memmove(i, i + 1, strlen(i)); >> + /* and ignore the character after that */ >> + i++; >> + } >> + } >> + return ret; >> +} >> + > > Repeated memmove() and strlen() somehow bothers me. Would there be > a more efficient and straight-forward way to do this, perhaps along > the lines of this instead? Thinking about efficiency, I have the believe that memmove can be faster than a `*src=*dst` thing we do ourselves as it may have access to specialized assembly instructions to move larger chunks of memory or such. So I think ideally we would do a block copy between the escape characters (sketched as:) last = input while input not ended: current = find next escape character in input memcpy from input value in the range of last to current last = current + 1 copy remaining parts if no further escape is found It doesn't seem worth the effort to get it right though. > > const char *src; > char *dst, *ret; > > ret = xmalloc(strlen(value)); xmallocz at least? > for (src = value, dst = ret; *src; src++, dst++) { > if (*src == '\\') { > if (!src[1]) > die(); > src++; > } > if (*src && invalid_value_char(*src)) > die("cannot use '%c' for value matching", *src) > *dst = *src; > } > *dst = '\0' > return ret; > Thanks, Stefan -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html