Re: [PATCH v4] userdiff: improve java hunk header regex

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Am 11.08.21 um 07:22 schrieb Tassilo Horn:
> Johannes Sixt <j6t@xxxxxxxx> writes:
>> These new tests are very much appreciated. You do not have to go wild
>> with that many return type tests; IMO, the simple one and the most
>> complicated one should do it. (And btw, s/cart/card/)
> 
> Well, they appeared naturally as a result during development and made it
> easier to spot errors when you know up to which level of complexity it
> still worked.  Is there a stronger reason to remove tests which might
> not be needed, e.g., runtime cost on some CI machines?

I totally understand how the test cases evolved. Having many of them is
not a big deal. It's just the disproportion of tests of this new feature
vs. the existing tests that your patch creates, in particular, when
earlier of the new tests are subsumed by later new tests.

> Another thing I've noticed (with my suggested patch) is that I should
> not try to match constructor signatures.  I think that's impossible
> because they are indistinguishable from method calls, e.g., in
> 
>   public class MyClass {
>       MyClass(String RIGHT) {
>           someMethodCall();
>           someOtherMethod(17)
>               .doThat();
>           // Whatever
>           // ChangeMe
>       }
>   }
> 
> there is no regex way to prefer MyClass(String RIGHT) over
> someOtherMethod().

Good find.

> So all in all, I'd propose this version in the next patch version:
> 
> --8<---------------cut here---------------start------------->8---
> PATTERNS("java",
> 	 "!^[ \t]*(catch|do|for|if|instanceof|new|return|switch|throw|while)\n"
>          "^[ \t]*("
>          /* Class, enum, and interface declarations */
>          "(([a-z]+[ \t]+)*(class|enum|interface)[ \t]+[A-Za-z][A-Za-z0-9_$]*[ \t]+.*)"
>          /* Method definitions; note that constructor signatures are not */
>          /* matched because they are indistinguishable from method calls. */
>          "|(([A-Za-z_<>&][][?&<>.,A-Za-z_0-9]*[ \t]+)+[A-Za-z_][A-Za-z_0-9]*[ \t]*\\([^;]*)"
>          ")$",
> 	 /* -- */
> 	 "[a-zA-Z_][a-zA-Z0-9_]*"
> 	 "|[-+0-9.e]+[fFlL]?|0[xXbB]?[0-9a-fA-F]+[lL]?"
> 	 "|[-+*/<>%&^|=!]="
> 	 "|--|\\+\\+|<<=?|>>>?=?|&&|\\|\\|"),
> --8<---------------cut here---------------end--------------->8---

That looks fine.

One suggestion, though. You do not have to have all positive patterns
("class, enum, interface" and "method definitions") in a single pattern
separated by "|". You can place them on different "lines" (note the "\n"
at the end of the first pattern):

	/* Class, enum, and interface declarations */
	"^[ \t]*(...(class|enum|interface)...)$\n"
	/*
	 * Method definitions; note that constructor signatures are not
	 * matched because they are indistinguishable from method calls.
	 */
	"^[ \t]*(...[A-Za-z_][A-Za-z_0-9]*[ \t]*\\([^;]*))$",

I don't think there is a technical difference, but I find this form
easier to understand because fewer open parentheses have to be tracked.

-- Hannes



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux