On 08.02.23 at 00:42, Andrei Rybak wrote:
> On 2023-02-05T22:33 Johannes Sixt wrote:
>> Having seen all these examples, I think the following truncated
>> expression might do the right thing for all cases that are valid Java:
>>
>> "^[ \t]*(([a-z-]+[ \t]+)*(class|enum|interface|record)[ \t].*)$"
>
> Only the '\n' is missing at the end, but otherwise I concur, so here's a v3.
>
>> i.e., we recognize a whitespace in order to identify the keyword, and
>> then capture anything that follows without being specific. My reasoning
>> is that "class", "enum", "interface", and "record" cannot occur in any
>> other context than the beginning of a class definition. (But please do
>> correct me; I know next to nothing about Java syntax.)
>
> The word "class" can also occur as part of a class literal, for example:
>
>     Class<String> c = String.class;
>
> but valid uses of class literals won't interfere with our regex, unless some
> wild formatting is applied. This is technically valid Java:
>
>     Class<String> c = String.
>     class
>     ;
>
> and with a space after lowercase "class", the v3 regex will trip.

Yeah, let's assume that nobody writes code like this.

This iteration is all good!

Reviewed-by: Johannes Sixt <j6t@xxxxxxxx>

-- Hannes
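
For anyone who wants to poke at the expression discussed above, here is a minimal sketch that runs it against a few sample lines. It assumes Python's `re` module is a close enough stand-in for the POSIX extended regex engine for this particular pattern; the helper name `is_type_header` and the sample lines are made up for illustration and are not part of the patch.

```python
import re

# The truncated expression from the thread, transcribed unchanged.
# Assumption: Python's `re` behaves like a POSIX ERE engine for this pattern.
JAVA_TYPE_HEADER = re.compile(
    r"^[ \t]*(([a-z-]+[ \t]+)*(class|enum|interface|record)[ \t].*)$"
)


def is_type_header(line: str) -> bool:
    """Hypothetical helper: True if the line would be picked as a header."""
    return JAVA_TYPE_HEADER.match(line) is not None


# Illustrative sample lines (not taken from the patch's test suite).
samples = [
    "public final class Foo {",            # ordinary class declaration
    "non-sealed class Bar extends Baz {",  # hyphenated modifier
    "record Point(int x, int y) {",        # record declaration
    "Class<String> c = String.class;",     # class literal on one line
    "int classes = 3;",                    # keyword only as a prefix
    "    class",                           # split literal, no trailing space
    "    class ",                          # split literal, trailing space
]

for line in samples:
    print(f"{is_type_header(line)!s:5}  {line!r}")
```

The last two samples correspond to the "wild formatting" case: the lone `class` line only matches when a space or tab follows the keyword, which is the situation the thread agrees to ignore.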