Johannes Sixt <j6t@xxxxxxxx> writes:

> I don't see the point in this complicated regex. Please recall that it
> will be applied only to syntactically correct Java text. Therefore, you
> do not have to implement all syntactical corner cases, just be
> sufficiently permissive.

Good suggestion.  We may want to mention the above principle as a
comment near the top of the patterns array.

> What is wrong with
>
> "^[ \t]*(([A-Za-z_][][?&<>.,A-Za-z_0-9]*[ \t]+)+[A-Za-z_][A-Za-z_0-9]*[
> \t]*\\([^;]*)$",
>
> i.e. take every "token" until an identifier followed by an opening
> parenthesis is found. Can types in Java contain parentheses? That would
> make my suggested simplified regex too permissive, but otherwise it
> would do its job, I would think.

Thanks.

---- >8 -------- >8 -------- >8 -------- >8 -------- >8 --------
Subject: userdiff: comment on the builtin patterns

Remind developers that they do not need to go overboard to implement
patterns to prepare for invalid constructs.  They only have to be
sufficiently permissive, assuming that the payload is syntactically
correct.

Text stolen mostly from Johannes Sixt.

Signed-off-by: Junio C Hamano <gitster@xxxxxxxxx>
---
 userdiff.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git c/userdiff.c w/userdiff.c
index d9b2ba752f..1a6d27fda6 100644
--- c/userdiff.c
+++ w/userdiff.c
@@ -13,6 +13,16 @@ static int drivers_alloc;
 #define IPATTERN(name, pattern, word_regex)			\
 	{ name, NULL, -1, { pattern, REG_EXTENDED | REG_ICASE }, \
 	  word_regex "|[^[:space:]]|[\xc0-\xff][\x80-\xbf]+" }
+
+/*
+ * Built-in drivers for various languages, sorted by their names
+ * (except that the "default" is left at the end).
+ *
+ * When writing or updating patterns, assume that the contents these
+ * patterns are applied to are syntactically correct.  You do not have
+ * to implement all syntactical corner cases---the patterns have to be
+ * sufficiently permissive.
+ */
 static struct userdiff_driver builtin_drivers[] = {
 IPATTERN("ada",
 	 "!^(.*[ \t])?(is[ \t]+new|renames|is[ \t]+separate)([ \t].*)?$\n"