Re: [PATCH] userdiff: add built-in pattern for CSS

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



William Duclot <william.duclot@xxxxxxxxxxxxxxxxxxxxxxx> writes:

> CSS is widely used, motivating it being included as a built-in pattern.
>
> It must be noted that the word_regex for CSS (i.e. the regex defining
> what is a word in the language) does not consider '.' and '#' characters
> (in CSS selectors) to be part of the word. This behavior is documented
> by the test t/t4018/css-rule.
> The logic behind this behavior is the following: identifiers in CSS
> selectors are identifiers in a HTML/XML document. Therefore, the '.'/'#'
> character are not part of the identifier, but an indicator of the nature
> of the identifier in HTML/XML (class or id). Diffing ".class1" and
> ".class2" must show that the class name is changed, but we still are
> selecting a class.
>
> Signed-off-by: William Duclot <william.duclot@xxxxxxxxxxxxxxxxxxxxxxx>
> Signed-off-by: Matthieu Moy <matthieu.moy@xxxxxxxxxxxxxxx>
> ---
> changes since v1:
> Fix a typo in the word_regex ("A-F" => "A-Z").
> Clearer comment about ISO 10646 characters.

It is not a big deal for a small single-patch topic like this, but
it often is hard to reviewers if you do not respond to comments you
received and instead just send a new version of the patch with
"changes since..." comment.  Please make it a habit to do both.

I can see in the above "changes since v1" comment, you took the
A-F/A-Z thing, but I cannot tell if you thought about PATTERNS vs
IPATTERN and rejected IPATTERN with a good reason or if you simply
missed it when reading the review comments you received, for
example.

Three remaining issues are:

 - Have you considered using IPATTERN()?  PATTERNS() that defaults
   case sensitive match is suitable for real languages with fixed
   case keywords, but the pattern you are defining for CSS does not
   do anything special for any set of fixed-case built-in keywords,
   and appears to be better served by IPATTERN().

 - In our codebase, we format multi-line comments in a particular
   way, namely

	/*
         * A multi-line comment begins with slash asterisk
         * on its own line, and its closing asterisk slash
         * also is on its own line.
         */

 - Try not to write overlong lines.  If your expression needs to
   become long and there is no good place to fold lines, that is one
   thing, but an overlong comment is unexcuable, as you can fold
   lines anywhere between words.

Thanks.

> diff --git a/userdiff.c b/userdiff.c
> index 6bf2505..9273969 100644
> --- a/userdiff.c
> +++ b/userdiff.c
> @@ -148,6 +148,14 @@ PATTERNS("csharp",
>  	 "[a-zA-Z_][a-zA-Z0-9_]*"
>  	 "|[-+0-9.e]+[fFlL]?|0[xXbB]?[0-9a-fA-F]+[lL]?"
>  	 "|[-+*/<>%&^|=!]=|--|\\+\\+|<<=?|>>=?|&&|\\|\\||::|->"),
> +PATTERNS("css",
> +	 "^([^,{}]+)((,[^}]*\\{)|([ \t]*\\{))$",
> +	 /* -- */
> +	 /* This regex comes from W3C CSS specs. Should theoretically also allow ISO 10646 characters U+00A0 and higher,
> +	  * but they are not handled with this regex. */
> +	 "-?[_a-zA-Z][-_a-zA-Z0-9]*" /* identifiers */
> +	 "|-?[0-9]+|\\#[0-9a-fA-F]+" /* numbers */
> +),
>  { "default", NULL, -1, { NULL, 0 } },
>  };
>  #undef PATTERNS

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]