On 2023-02-05T22:33 Johannes Sixt wrote: > Having seen all these examples, I think the following truncated > expression might do the right thing for all cases that are valid Java: > > "^[ \t]*(([a-z-]+[ \t]+)*(class|enum|interface|record)[ \t].*)$" Only the '\n' is missing at the end, but otherwise I concur, so here's a v3. > i.e., we recognize a whitespace in order to identify the keyword, and > then capture anything that follows without being specific. My reasoning > is that "class", "enum", "interface", and "record" cannot occur in any > other context than the beginning of a class definition. (But please do > correct me; I know next to nothing about Java syntax.) The word "class" can also occur as part of a class literal, for example: Class<String> c = String.class; but valid uses of class literals won't interfere with our regex, unless some wild formatting is applied. This is technically valid Java: Class<String> c = String. class ; and with a space after lowercase "class", the v3 regex will trip. Class literals are described in the JLS here: https://docs.oracle.com/javase/specs/jls/se17/html/jls-15.html#jls-15.8.2 > As always, > userdiff regular expressions can assume that only valid constructs are > inspected. Changes since v2: - simplified regex that doesn't match class names at all and supports more code styles - updated the comment just above the regex in PATCH 2/3 to mention records - more tests to cover the cases mentioned during review of v2 - reworded commit messages to reflect the above items Range diff since v2: 1: c300745a58 ! 1: 9e859e3b79 userdiff: support Java type parameters @@ Metadata ## Commit message ## userdiff: support Java type parameters - A class or interface in Java [1] can have type parameters immediately - following the name in the declaration, surrounded by angle brackets - (paired less than and greater than signs).[2] Example of a class with - type parameters "A" and "N": - - public class ParameterizedClass<A, N> { - private A field1; - private N field2; + A class or interface in Java can have type parameters following the name + in the declared type, surrounded by angle brackets (paired less than and + greater than signs).[2] The type parameters -- `A` and `B` in the + examples -- may follow the class name immediately: + + public class ParameterizedClass<A, B> { } - Support matching a parameterized class or interface declaration with - type parameters immediately following the name of the type in the - builtin userdiff pattern for Java. Do so by just allowing matching the - first character after the name of the type to "<". + or may be separated by whitespace: + + public class SpaceBeforeTypeParameters <A, B> { + } - An alternative approach could be to match both the opening and the - closing angle brackets and matching the content between them in various - ways. Just use the simpler regex for now. + A part of the builtin userdiff pattern for Java matches declarations of + classes, enums, and interfaces. The regular expression requires at + least one whitespace character after the name of the declared type. + This disallows matching for opening angle bracket of type parameters + immediately after the name of the type. Mandatory whitespace after the + name of the type also disallows using the pattern in repositories with a + fairly common code style that puts braces for the body of a class on + separate lines: + + class WithLineBreakBeforeOpeningBrace + { + } + + Support matching Java code in more diverse code styles and declarations + of classes and interfaces with type parameters immediately following the + name of the type in the builtin userdiff pattern for Java. Do so by + just matching anything until the end of the line after the keywords for + the kind of type being declared. [1] Since Java 5 released in 2004. [2] Detailed description is available in the Java Language @@ Commit message Signed-off-by: Andrei Rybak <rybak.a.v@xxxxxxxxx> + ## t/t4018/java-class-brace-on-separate-line (new) ## +@@ ++class RIGHT ++{ ++ static int ONE; ++ static int TWO; ++ static int ChangeMe; ++} + + ## t/t4018/java-class-space-before-type-parameters (new) ## +@@ ++class RIGHT <TYPE, PARAMS, AFTER, SPACE> { ++ static int ONE; ++ static int TWO; ++ static int THREE; ++ private A ChangeMe; ++} + ## t/t4018/java-class-type-parameters (new) ## @@ +class RIGHT<A, B> { @@ userdiff.c: PATTERNS("html", "!^[ \t]*(catch|do|for|if|instanceof|new|return|switch|throw|while)\n" /* Class, enum, and interface declarations */ - "^[ \t]*(([a-z]+[ \t]+)*(class|enum|interface)[ \t]+[A-Za-z][A-Za-z0-9_$]*[ \t]+.*)$\n" -+ "^[ \t]*(([a-z]+[ \t]+)*(class|enum|interface)[ \t]+[A-Za-z][A-Za-z0-9_$]*([ \t]+|<).*)$\n" ++ "^[ \t]*(([a-z]+[ \t]+)*(class|enum|interface)[ \t]+.*)$\n" /* Method definitions; note that constructor signatures are not */ /* matched because they are indistinguishable from method calls. */ "^[ \t]*(([A-Za-z_<>&][][?&<>.,A-Za-z_0-9]*[ \t]+)+[A-Za-z_][A-Za-z_0-9]*[ \t]*\\([^;]*)$", 2: a0e622a0f8 ! 2: 4f7be5f642 userdiff: support Java record types @@ Commit message the name of the record class is followed by a mandatory list of components. The list is enclosed in parentheses, it may be empty, and it may immediately follow the name of the class or type parameters, if - any, without separating whitespace. - - Code examples: + any, with or without separating whitespace. For example: public record Example(int i, String s) { } @@ Commit message public record WithTypeParameters<A, B>(A a, B b, String s) { } + record SpaceBeforeComponents (String comp1, int comp2) { + } + Support records in the builtin userdiff pattern for Java. Add "record" - to the alternatives of keywords for kinds of class, and match an opening - parenthesis as the first character right after the type name. + to the alternatives of keywords for kinds of class. - An alternative approach could be to have an optional group that would - match both the opening and the closing parentheses with some way of - matching the declarations of the components. Just use the simpler - regular expression for now. + Allowing matching various possibilities for the type parameters and/or + list of the components of a record has already been covered by the + preceding patch. [1] detailed description is available in "JEP 395: Records" https://openjdk.org/jeps/395 @@ t/t4018/java-record (new) + static int TWO; + static int THREE; + static int ChangeMe; ++} + + ## t/t4018/java-record-space-before-components (new) ## +@@ ++public record RIGHT (String components, String after, String space) { ++ static int ONE; ++ static int TWO; ++ static int THREE; ++ static int ChangeMe; +} ## t/t4018/java-record-type-parameters (new) ## @@ t/t4018/java-record-type-parameters (new) ## userdiff.c ## @@ userdiff.c: PATTERNS("html", + "[^<>= \t]+"), PATTERNS("java", "!^[ \t]*(catch|do|for|if|instanceof|new|return|switch|throw|while)\n" - /* Class, enum, and interface declarations */ -- "^[ \t]*(([a-z]+[ \t]+)*(class|enum|interface)[ \t]+[A-Za-z][A-Za-z0-9_$]*([ \t]+|<).*)$\n" -+ "^[ \t]*(([a-z]+[ \t]+)*(class|enum|interface|record)[ \t]+[A-Za-z][A-Za-z0-9_$]*([ \t]+|[<(]).*)$\n" +- /* Class, enum, and interface declarations */ +- "^[ \t]*(([a-z]+[ \t]+)*(class|enum|interface)[ \t]+.*)$\n" ++ /* Class, enum, interface, and record declarations */ ++ "^[ \t]*(([a-z]+[ \t]+)*(class|enum|interface|record)[ \t]+.*)$\n" /* Method definitions; note that constructor signatures are not */ /* matched because they are indistinguishable from method calls. */ "^[ \t]*(([A-Za-z_<>&][][?&<>.,A-Za-z_0-9]*[ \t]+)+[A-Za-z_][A-Za-z_0-9]*[ \t]*\\([^;]*)$", 3: b9c6a5dffd ! 3: ea6ce671ef userdiff: support Java sealed classes @@ userdiff.c @@ userdiff.c: PATTERNS("html", PATTERNS("java", "!^[ \t]*(catch|do|for|if|instanceof|new|return|switch|throw|while)\n" - /* Class, enum, and interface declarations */ -- "^[ \t]*(([a-z]+[ \t]+)*(class|enum|interface|record)[ \t]+[A-Za-z][A-Za-z0-9_$]*([ \t]+|[<(]).*)$\n" -+ "^[ \t]*(([a-z-]+[ \t]+)*(class|enum|interface|record)[ \t]+[A-Za-z][A-Za-z0-9_$]*([ \t]+|[<(]).*)$\n" + /* Class, enum, interface, and record declarations */ +- "^[ \t]*(([a-z]+[ \t]+)*(class|enum|interface|record)[ \t]+.*)$\n" ++ "^[ \t]*(([a-z-]+[ \t]+)*(class|enum|interface|record)[ \t]+.*)$\n" /* Method definitions; note that constructor signatures are not */ /* matched because they are indistinguishable from method calls. */ "^[ \t]*(([A-Za-z_<>&][][?&<>.,A-Za-z_0-9]*[ \t]+)+[A-Za-z_][A-Za-z_0-9]*[ \t]*\\([^;]*)$", Andrei Rybak (3): userdiff: support Java type parameters userdiff: support Java record types userdiff: support Java sealed classes t/t4018/java-class-brace-on-separate-line | 6 ++++++ t/t4018/java-class-space-before-type-parameters | 6 ++++++ t/t4018/java-class-type-parameters | 6 ++++++ t/t4018/java-class-type-parameters-implements | 6 ++++++ t/t4018/java-interface-type-parameters | 6 ++++++ t/t4018/java-interface-type-parameters-extends | 6 ++++++ t/t4018/java-non-sealed | 8 ++++++++ t/t4018/java-record | 6 ++++++ t/t4018/java-record-space-before-components | 6 ++++++ t/t4018/java-record-type-parameters | 6 ++++++ t/t4018/java-sealed | 7 +++++++ t/t4018/java-sealed-permits | 6 ++++++ t/t4018/java-sealed-type-parameters | 6 ++++++ t/t4018/java-sealed-type-parameters-implements-permits | 6 ++++++ t/t4018/java-sealed-type-parameters-permits | 6 ++++++ userdiff.c | 4 ++-- 16 files changed, 95 insertions(+), 2 deletions(-) create mode 100644 t/t4018/java-class-brace-on-separate-line create mode 100644 t/t4018/java-class-space-before-type-parameters create mode 100644 t/t4018/java-class-type-parameters create mode 100644 t/t4018/java-class-type-parameters-implements create mode 100644 t/t4018/java-interface-type-parameters create mode 100644 t/t4018/java-interface-type-parameters-extends create mode 100644 t/t4018/java-non-sealed create mode 100644 t/t4018/java-record create mode 100644 t/t4018/java-record-space-before-components create mode 100644 t/t4018/java-record-type-parameters create mode 100644 t/t4018/java-sealed create mode 100644 t/t4018/java-sealed-permits create mode 100644 t/t4018/java-sealed-type-parameters create mode 100644 t/t4018/java-sealed-type-parameters-implements-permits create mode 100644 t/t4018/java-sealed-type-parameters-permits -- 2.39.1