Re: [RFC/PATCH] diff: funcname and word patterns for perl

Thomas Rast wrote:

> I just took the laziest (and most obvious) approach possible when I
> wrote the original patterns.  I think the second laziest one
> would be to observe that bit patterns for leading characters are
> always 11.., while those for continuation chars are 10..
> 
> So that gives
> 
>   |[\xc0-\xff][\x80-\xbf]+

Yes, that's what I was thinking of.  v2 will be a two-part series
starting with that.
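
For anyone who wants to see the bit-pattern observation spelled out, here
is a rough sketch in C.  The function names are made up for illustration
and are not part of the patch.  Note that, like the pattern itself, this
only looks at the top two bits, so it does not validate UTF-8 (bytes such
as 0xff are accepted as lead bytes).

```c
#include <assert.h>
#include <stddef.h>

/* Lead bytes have the top two bits set to 11 (0xc0-0xff). */
static int is_lead_byte(unsigned char c)
{
	return (c & 0xc0) == 0xc0;
}

/* Continuation bytes have the top two bits set to 10 (0x80-0xbf). */
static int is_continuation_byte(unsigned char c)
{
	return (c & 0xc0) == 0x80;
}

/*
 * Hypothetical helper: return the length of the match of
 * [\xc0-\xff][\x80-\xbf]+ at the start of buf, or 0 if the
 * bytes there do not match.
 */
static size_t match_multibyte(const char *buf, size_t len)
{
	size_t n = 1;

	assert(buf != NULL);
	if (len < 2 || !is_lead_byte((unsigned char)buf[0]))
		return 0;
	while (n < len && is_continuation_byte((unsigned char)buf[n]))
		n++;
	/* The "+" requires at least one continuation byte. */
	return n > 1 ? n : 0;
}
```

A lone lead byte followed by ASCII does not match, so such a byte would
fall through to whatever single-character alternative follows in the
word pattern.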

BTW, the perl token matcher is pretty half-hearted.  In part this is
because "only perl can parse perl" [1] terrifies me and in part it is
because I am too lazy to write down the state machine implied by
PPI/Token/*.pm.

If some tokenization wizard would like to work on it, something like
the following might produce more pleasant word diffs:

	"[%&$][[:space:]]*[0-9]+"	/* $1 */
	"|[%&$][[:space:]]*([[:alpha:]_']|::)([[:alnum:]_']|::)*"	/* $var1 */
	"|[%&$][[:space:]]*\\$([[:alnum:]_]|::)([[:alnum:]_']|::)*"	/* $$var1 */
	"|[%&$][[:space:]]*\\$\\{"     /* $${ introducing complicated expression */
	"|[%&$][[:space:]]*\\$\\$"     /* $$$ introducing complicated expression */
	"|[%&$][[:space:]]*[^[:alnum:]_:'^$]"	/* $! */
	"|[%&$][[:space:]]*\\^[][A-Z\\^_?]"	/* $^A */
	"|[%&$][[:space:]]*\\{\\^[][A-Z\\^_?]\\}"	/* ${^A} */
	"|[%&$][[:space:]]*\\{\\^[][A-Z\\^_?][[:alnum:]_]*\\}" /* ${^Foo} */
	/* ${var} */
	"|[%&$][[:space:]]*\\{[[:space:]]*([[:alpha:]_']|::)[[:alnum:]_:]*[[:space:]]*\\}"
	"|[%&$][[:space:]]*\\{"	/* ${ introducing complicated expression */
	...

though it is an unmaintainable mess. :)
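
To experiment with alternatives like the ones above outside of git, a
small harness along these lines might help.  It is only a sketch: it
assumes POSIX regcomp() with REG_EXTENDED, it uses just four of the
alternatives, and first_perl_var() is a name invented for this example.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* A subset of the alternatives listed above, joined into one ERE. */
static const char perl_var_pattern[] =
	"[%&$][[:space:]]*[0-9]+"	/* $1 */
	"|[%&$][[:space:]]*([[:alpha:]_']|::)([[:alnum:]_']|::)*"	/* $var1 */
	"|[%&$][[:space:]]*\\^[][A-Z\\^_?]"	/* $^A */
	"|[%&$][[:space:]]*\\{";	/* ${ introducing complicated expression */

/*
 * Hypothetical helper: find the first variable-like token in line.
 * On success, store its offset and length and return 0.  Return -1
 * if nothing matches or the pattern fails to compile.
 */
static int first_perl_var(const char *line, size_t *start, size_t *len)
{
	regex_t re;
	regmatch_t m;
	int ret = -1;

	assert(line != NULL && start != NULL && len != NULL);
	if (regcomp(&re, perl_var_pattern, REG_EXTENDED))
		return -1;
	if (!regexec(&re, line, 1, &m, 0)) {
		*start = (size_t)m.rm_so;
		*len = (size_t)(m.rm_eo - m.rm_so);
		ret = 0;
	}
	regfree(&re);
	return ret;
}
```

Feeding it lines such as "my $foo::bar = 1;" or "return $^W;" shows
which alternative wins and how much of the sigil-plus-name ends up in
a single token.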

[1] perl::toke.c and http://www.perlmonks.org/?node_id=44722