Re: [PATCH v4] userdiff: improve java hunk header regex

On 10.08.21 at 21:09, Tassilo Horn wrote:
> Currently, the git diff hunk headers show the wrong method signature if the
> method has a qualified return type, an array return type, or a generic return
> type because the regex doesn't allow dots (.), [], or < and > in the return
> type.  Also, type parameter declarations couldn't be matched.
> 
> Add several t4018 tests asserting the right hunk headers for increasingly
> complex method signatures:
> 
>   public String[] myMethod(String[] RIGHT)
>   public List<String> myMethod(String[] RIGHT)
>   public <T> List<T> myMethod(T[] RIGHT)
>   public <AType, B> Map<AType, B> myMethod(String[] RIGHT)
>   public <AType, B> java.util.Map<AType, Map<B, B[]>> myMethod(String[] RIGHT)
>   public List<? extends Comparable> myMethod(String[] RIGHT)
>   public <T extends Serializable & Comparable<T>> List<T> myMethod(String[] RIGHT)
> 
> Signed-off-by: Tassilo Horn <tsdh@xxxxxxx>
> ---
>  t/t4018/java-constructor             |  6 ++++++
>  t/t4018/java-enum-constant           |  6 ++++++
>  t/t4018/java-nested-field            |  6 ++++++
>  t/t4018/java-return-array            |  6 ++++++
>  t/t4018/java-return-generic          |  6 ++++++
>  t/t4018/java-return-generic-bounded  |  6 ++++++
>  t/t4018/java-return-generic-wildcart |  6 ++++++
>  t/t4018/java-return-generic2         |  6 ++++++
>  t/t4018/java-return-generic3         |  6 ++++++
>  t/t4018/java-return-generic4         |  6 ++++++
>  userdiff.c                           | 23 ++++++++++++++++++++++-
>  11 files changed, 82 insertions(+), 1 deletion(-)
>  create mode 100644 t/t4018/java-constructor
>  create mode 100644 t/t4018/java-enum-constant
>  create mode 100644 t/t4018/java-nested-field
>  create mode 100644 t/t4018/java-return-array
>  create mode 100644 t/t4018/java-return-generic
>  create mode 100644 t/t4018/java-return-generic-bounded
>  create mode 100644 t/t4018/java-return-generic-wildcart
>  create mode 100644 t/t4018/java-return-generic2
>  create mode 100644 t/t4018/java-return-generic3
>  create mode 100644 t/t4018/java-return-generic4
> 

These new tests are very much appreciated. You do not have to go wild
with that many return type tests; IMO, the simple one and the most
complicated one should do it. (And btw, s/cart/card/)

> diff --git a/t/t4018/java-return-array b/t/t4018/java-return-array
> new file mode 100644
> index 0000000000..747638b9a8
> --- /dev/null
> +++ b/t/t4018/java-return-array
> @@ -0,0 +1,6 @@
> +class MyExample {
> +    public String[] myMethod(String[] RIGHT) {
> +        // Whatever...
> +        return new; // ChangeMe
> +    }
> +}
> diff --git a/userdiff.c b/userdiff.c
> index 3c3bbe38b0..9bd751b7d2 100644
> --- a/userdiff.c
> +++ b/userdiff.c
> @@ -142,7 +142,28 @@ PATTERNS("html",
>  	 "[^<>= \t]+"),
>  PATTERNS("java",
>  	 "!^[ \t]*(catch|do|for|if|instanceof|new|return|switch|throw|while)\n"
> -	 "^[ \t]*(([A-Za-z_][A-Za-z_0-9]*[ \t]+)+[A-Za-z_][A-Za-z_0-9]*[ \t]*\\([^;]*)$",
> +         "^[ \t]*("
> +         /* Class, enum, and interface declarations: */
> +         /*   optional modifiers: public */
> +         "(([a-z]+[ \t]+)*"
> +         /*   the kind of declaration */
> +         "(class|enum|interface)[ \t]+"
> +         /*   the name */
> +         "[A-Za-z][A-Za-z0-9_$]*[ \t]+.*)"
> +         /* Method & constructor signatures: */
> +         /*   optional modifiers: public static */
> +         "|(([a-z]+[ \t]+)*"
> +         /*   type params and return types for methods but not constructors */
> +         "("
> +         /*     optional type parameters: <A, B extends Comparable<B>> */
> +         "(<[A-Za-z0-9_,.&<> \t]+>[ \t]+)?"
> +         /*     return type: java.util.Map<A, B[]> or List<?> */
> +         "([A-Za-z_]([A-Za-z_0-9<>,.?]|\\[[ \t]*\\])*[ \t]+)+"
> +         /*   end of type params and return type */
> +         ")?"
> +         /*   the method name followed by the parameter list: myMethod(...) */
> +         "[A-Za-z_][A-Za-z_0-9]*[ \t]*\\([^;]*)"
> +         ")$",

I don't see the point in this complicated regex. Please recall that it
will be applied only to syntactically correct Java text. Therefore, you
do not have to implement all syntactical corner cases, just be
sufficiently permissive.

What is wrong with

	"^[ \t]*(([A-Za-z_][][?&<>.,A-Za-z_0-9]*[ \t]+)+[A-Za-z_][A-Za-z_0-9]*[ \t]*\\([^;]*)$",

i.e., take every "token" until an identifier followed by an opening
parenthesis is found? Can types in Java contain parentheses? If so, my
suggested simplified regex would be too permissive; otherwise I would
think it does its job.

-- Hannes


