Re: [GSoC][PATCH v2 1/1] userdiff: add support for scheme

On 05-Apr-2021, at 15:51, Phillip Wood <phillip.wood123@xxxxxxxxx> wrote:
> 
> Hi Atharva
> On 03/04/2021 14:16, Atharva Raykar wrote:
>> Add a diff driver for Scheme-like languages which recognizes top level
>> and local `define` forms, whether it is a function definition, binding,
>> syntax definition or a user-defined `define-xyzzy` form.
>> Also supports R6RS `library` forms, `module` forms along with class and
>> struct declarations used in Racket (PLT Scheme).
>> Alternate "def" syntax such as those in Gerbil Scheme are also
>> supported, like defstruct, defsyntax and so on.
>> The rationale for picking `define` forms for the hunk headers is because
>> it is usually the only significant form for defining the structure of
>> the program, and it is a common pattern for schemers to have local
>> function definitions to hide their visibility, so it is not only the top
>> level `define`'s that are of interest. Schemers also extend the language
>> with macros to provide their own define forms (for example, something
>> like a `define-test-suite`) which is also captured in the hunk header.
>> Since it is common practice to extend syntax with variants of a form
>> like `module+`, `class*` etc, those have been supported as well.
>> The word regex is a best-effort attempt to conform to R6RS[1] valid
>> identifiers, symbols and numbers.
>> [1] http://www.r6rs.org/final/html/r6rs/r6rs-Z-H-7.html#node_chap_4
>> Signed-off-by: Atharva Raykar <raykar.ath@xxxxxxxxx>
>> [...]
>> diff --git a/userdiff.c b/userdiff.c
>> index 3f81a2261c..ac1999bbc5 100644
>> --- a/userdiff.c
>> +++ b/userdiff.c
>> @@ -191,6 +191,10 @@ PATTERNS("rust",
>>  	 "[a-zA-Z_][a-zA-Z0-9_]*"
>>  	 "|[0-9][0-9_a-fA-Fiosuxz]*(\\.([0-9]*[eE][+-]?)?[0-9_fF]*)?"
>>  	 "|[-+*\\/<>%&^|=!:]=|<<=?|>>=?|&&|\\|\\||->|=>|\\.{2}=|\\.{3}|::"),
>> +PATTERNS("scheme",
>> +	 "^[\t ]*(\\(((define|def(struct|syntax|class|method|rules|record|proto|alias)?)[-*/ \t]|(library|module|struct|class)[*+ \t]).*)$",
>> +	 /* All words should be delimited by spaces or parentheses */
>> +	 "([^][)(}{[ \t])+"),
> 
> I think it would be nice to match single '(' and '[' to highlight when
> they have been added or deleted - I find this useful when I get a
> syntax error. Also it would be nice to handle r7rs identifiers like
> | this is a symbol |. Maybe something like
> "(\\|([^\\\\|]*(\\\\|)*)*\\||[^][}{)( \t]|[][(){}])"

My patch already seems to detect additions and removals of single
parentheses. I am not sure why it works; my suspicion is that the
userdiff code falls back to some default rule for characters that the
current word regex does not match. Either way, it seems to work.
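
One possible explanation, sketched here rather than verified against the
patch: as far as I can tell, the PATTERNS() macro in userdiff.c appends a
catch-all alternative ("|[^[:space:]]" plus a multibyte UTF-8 case) to
every driver's word regex, so any non-space character the driver regex
skips still becomes a one-character word. The snippet below imitates that
with Python's `re` module as a stand-in for POSIX ERE; `\S` approximates
`[^[:space:]]`, and the helper name `words` is made up for illustration.

```python
import re

# The word regex from the patch, unescaped from the C string literal.
SCHEME_WORD = r"([^][)(}{[ \t])+"

# Assumption: PATTERNS() appends a catch-all alternative to the driver's
# word regex. \S stands in for the POSIX class [^[:space:]]; the
# multibyte UTF-8 alternative is left out of this sketch.
WITH_FALLBACK = SCHEME_WORD + r"|\S"


def words(regex, line):
    """Return the successive non-overlapping matches of regex in line."""
    return [m.group(0) for m in re.finditer(regex, line)]


line = "(define (square x) (* x x))"

# Driver regex alone: parentheses never match, so they are not words.
print(words(SCHEME_WORD, line))

# With the catch-all: each parenthesis becomes a word of its own.
print(words(WITH_FALLBACK, line))
```

If that reading is right, it would account for lone parentheses showing up
in word diffs even though the Scheme regex itself excludes them.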

As for the R7RS identifiers, I can definitely add that, thanks for
pointing that out!
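
A rough sketch of what that could look like, using Python's `re` module
as a stand-in for POSIX ERE. The `PIPED_IDENT` pattern is a hypothetical
one written for this illustration, not the expression quoted above and
not what a later version of the patch necessarily uses; it only shows the
idea of trying the vertical-bar form before the plain word form.

```python
import re

# Hypothetical pattern: an identifier delimited by vertical bars, with
# backslash escapes allowed between the bars.
PIPED_IDENT = r"\|([^\\|]|\\.)*\|"

# The word regex from the patch, unescaped from the C string literal.
PLAIN_WORD = r"([^][)(}{[ \t])+"

# Python tries alternatives in order, so the piped form goes first to
# keep it from being split at its inner spaces.
WORD = PIPED_IDENT + "|" + PLAIN_WORD


def words(regex, line):
    """Return the successive non-overlapping matches of regex in line."""
    return [m.group(0) for m in re.finditer(regex, line)]


line = "(define |this is a symbol| 42)"

# Plain word regex: the piped identifier is split at every space.
print(words(PLAIN_WORD, line))

# Combined regex: the piped identifier stays in one piece.
print(words(WORD, line))
```

With the combined pattern, a change inside the bars would be reported as
a change to one identifier rather than to several unrelated words.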

> Best Wishes
> 
> Phillip
> 
>>  PATTERNS("bibtex", "(@[a-zA-Z]{1,}[ \t]*\\{{0,1}[ \t]*[^ \t\"@',\\#}{~%]*).*$",
>>  	 "[={}\"]|[^={}\" \t]+"),
>>  PATTERNS("tex", "^(\\\\((sub)*section|chapter|part)\\*{0,1}\\{.*)$",




